Abstract

We consider a peak-power-limited single-antenna block-stationary Gaussian fading channel where neither the
transmitter nor the receiver knows the channel state information, but both know the channel statistics. This model 
subsumes most previously studied Gaussian fading models. We first compute the asymptotic channel capacity in the 
high SNR regime and show that the behavior of channel capacity depends critically on the channel model. For the 
special case where the fading process is symbol-by-symbol stationary, we also reveal a fundamental interplay between 
the codeword length, communication rate, and decoding error probability. Specifically, we show that the codeword 
length must scale with SNR in order to guarantee that the communication rate can grow logarithmically with SNR 
with bounded decoding error probability, and we find a necessary condition for the growth rate of the codeword 
length. We also derive an expression for the capacity per unit energy. Furthermore, we show that the capacity per 
unit energy is achievable using temporal ON-OFF signaling with optimally allocated ON symbols, where the optimal 
ON-symbol allocation scheme may depend on the peak power constraint. 

Index Terms 

Wireless channels, Noncoherent capacity, Capacity per unit cost, Block fading 

I. Introduction 

The capacity analysis of noncoherent fading channels has received considerable attention in recent years since it 
provides the ultimate limit on the rate of reliable communication on such channels. 

Proposed approaches to modeling noncoherent fading channels can be classified into two broad categories. The 
first is to model the fading process as a block-independent process. In the standard version of this model [1], the 
channel remains constant over blocks consisting of T symbol periods, and changes independently from block to 
block. The second is to model the fading process as a symbol-by-symbol stationary process. In this model, the 
independence assumption is removed, but the block structure is not allowed. Somewhat surprisingly, these two 
models lead to very different capacity results. For the standard block fading model, the capacity is shown [1], [2] to 
grow logarithmically with SNR, while for the symbol-by-symbol stationary model, the capacity grows only
double-logarithmically in SNR at high SNR if the fading process is regular [3]-[5]. For symbol-by-symbol stationary
Gaussian fading channels, if the Lebesgue measure of the set of harmonics where the spectral density of the fading 
process is zero is positive, the fading process is nonregular and the capacity grows logarithmically with SNR [6]. 
This result is consistent with the capacity result for block-independent fading channels in the sense that the log SNR 
behavior in the high SNR regime results from the rank deficiency of the correlation matrix of the fading process. 
This point was elucidated in [7] where a time-selective block fading model was considered in which the rank of 
the correlation within the block is allowed to be any number between one and the blocklength. 

However, the mechanisms that cause the rank deficiency in the block-independent fading and nonregular
symbol-by-symbol stationary models are different. For the block-independent fading model, the rank deficiency happens
within each block. But for the nonregular symbol-by-symbol stationary fading channel model, the correlation matrix 
of the fading process over any finite block can still be full-rank; the rank deficiency in this case is in the asymptotic 
sense. In general, the rank deficiency of the correlation matrix can be affected by both the short timescale correlation 
of the fading process as in the block-independent fading model and large timescale correlation as in the
symbol-by-symbol stationary channel model. In order to capture both of these effects, we model the fading process as a
block-stationary Gaussian process. 

The block-stationary model was introduced and justified in [7]. We summarize the main points of the justification 
here. In the block-independent fading model, the channel is assumed to change in an i.i.d. manner from block to 
block. The independence can be justified in certain time division or frequency hopping systems where the blocks 
are separated sufficiently in time or frequency to undergo independent fading. The independence assumption is also 
convenient for information-theoretic analysis as it allows us to focus on one block in studying the capacity. If the 
blocks are not separated far enough in time or frequency, the fading process can be correlated across blocks and 
the block-stationary model is more appropriate in this scenario. Without time or frequency hopping, the channel 
variations from one block to the next are dictated by the long term variations in the scattering environment. If we 
assume that the variations in average channel power are compensated for by other means such as power control, it 
is reasonable to model the variation from block to block as stationary and ergodic. 

Remark 1: The block-stationary model does not imply that the fading process is stationary on a
symbol-by-symbol basis as in the analysis of [3], [6]. But as explained in [7], the symbol-by-symbol stationary model is not
realistic for time intervals that are larger than that corresponding to a few wavelengths. For this reason it may be 
more accurate to model the fading process using a block fading model with possible correlation across blocks than 
it is to model it as a symbol-by-symbol stationary process. From the viewpoint of analysis, the block-stationary 
model generalizes all previously considered models discussed above and therefore so do the capacity results for this 
model. More importantly, the block-stationary model provides us with a framework to study the interplay between 
many aspects of fading channels which are not captured in the aforementioned models, and allows us to identify 
the properties that are shared by different models and the properties that depend on channel modelling. 

The channel capacity for the block-stationary model was only studied in [7] under certain constraints on the 
correlation structure across blocks, which essentially disallow rank deficiency over the large timescale. In this
paper, we conduct a more complete study of the capacity for this channel model. 

II. Notation and System Model 

A. Notation 

The following notation is used in this paper. For deterministic objects, uppercase letters denote matrices, lowercase
letters denote scalars, and underlined lowercase letters denote vectors. Random objects are identified by the
corresponding boldfaced letters. For example, $\mathbf{X}$ denotes a random matrix, $X$ denotes a realization of
$\mathbf{X}$, $\underline{\mathbf{x}}$ denotes a random vector, and $\mathbf{x}$ denotes a random scalar. For simplicity,
we sometimes also use $\mathbf{x}^n$ to denote the random vector $(\mathbf{x}_1, \mathbf{x}_2, \cdots, \mathbf{x}_n)^T$.
Although uppercase letters are typically used for matrices, there are some exceptions,
and these exceptions are noted explicitly in the paper. The operators $\det$, $\mathrm{tr}$, $*$, $T$, and $\dagger$ denote
determinant, trace, conjugate, transpose, and conjugate transpose, respectively. We let $I_M$ denote the $M \times M$
identity matrix for any positive integer $M$, and let $\mathrm{var}(\underline{a}|\underline{b})$ denote
$E[(\underline{a} - E(\underline{a}|\underline{b}))(\underline{a} - E(\underline{a}|\underline{b}))^\dagger]$ for random vectors $\underline{a}$ and $\underline{b}$.

B. System Model 

We consider a discrete-time channel whose time-$t$ complex-valued output $y_t \in \mathbb{C}$ is given by

$$y_t = h_t x_t + z_t \tag{1}$$

where $x_t \in \mathbb{C}$ is the input at time $t$ with peak power constraint $|x_t|^2 \le \mathrm{SNR}$; $\{h_t\}$ models the fading process;
and $\{z_t\}$ models additive noise. The processes $\{h_t\}$ and $\{z_t\}$ are assumed to be independent and have a joint
distribution that does not depend on the input $\{x_t\}$. We assume that $\{z_t\}$ is a sequence of i.i.d. circularly
symmetric complex-Gaussian random variables of zero mean and unit variance, i.e., $z_t \sim \mathcal{CN}(0, 1)$. We assume
that the fading process $\{h_t\}$ is a block-stationary process with $h_t \sim \mathcal{CN}(0, 1)$ and block length $T$, i.e.,
$\{\underline{h}_k = (h_{kT+1}, h_{kT+2}, \cdots, h_{kT+T})^T\}_k$ is a vector-valued stationary process. Furthermore, we assume that
$\{\underline{h}_k\}$ is an ergodic process with a matrix spectral density function $S(e^{j\omega})$, $-\pi \le \omega \le \pi$. Specifically,

$$S(e^{j\omega}) = \sum_{i=-\infty}^{\infty} R(i)\, e^{-ji\omega}$$

where $R(i) = E[\underline{h}_k \underline{h}_{k-i}^\dagger]$. Since $R(i) = R^\dagger(-i)$, $i \in \mathbb{Z}$, it is not hard to check that $S(e^{j\omega})$ is Hermitian, i.e.,
$S(e^{j\omega}) = S^\dagger(e^{j\omega})$. Moreover, we have $S(e^{j\omega}) \succeq 0$ ($-\pi \le \omega \le \pi$), i.e., $S(e^{j\omega})$ is a positive semi-definite matrix.

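There is an interesting relation between the matrix spectral density function and the asymptotic prediction error.
Specifically, for the block-stationary process $\{h_t + \frac{1}{\sqrt{\mathrm{SNR}}} z_t\}_t$, define the following prediction error covariance
matrices:

$$\Sigma(\mathrm{SNR}) = \mathrm{var}\left( \left(h_1 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_1,\; h_2 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_2,\; \cdots,\; h_T + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_T\right)^T \,\Big|\, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{0} \right),$$

$$\Sigma(\infty) = \mathrm{var}\left( (h_1, h_2, \cdots, h_T)^T \,\Big|\, \{h_t\}_{t=-\infty}^{0} \right).$$

Then $\Sigma(\mathrm{SNR})$ and $\Sigma(\infty)$ are related to the matrix spectral density function $S(e^{j\omega})$ of $\{h_t\}$ as [8]

$$\det[\Sigma(\mathrm{SNR})] = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\det\left[S(e^{j\omega}) + \frac{1}{\mathrm{SNR}} I_T\right] d\omega \right\}, \tag{2}$$

$$\det[\Sigma(\infty)] = \exp\left\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\det\left[S(e^{j\omega})\right] d\omega \right\}. \tag{3}$$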
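As a numerical sanity check of (2) and (3) in the scalar case $T = 1$, the following sketch evaluates the two integrals with a midpoint rule for a Gauss-Markov spectrum. The choice of spectrum (with a real correlation coefficient `rho`), the function names, and the grid size are illustrative assumptions and are not taken from the paper; natural logarithms are assumed.

```python
import math


def gauss_markov_spectrum(rho, omega):
    """Spectral density of a unit-variance Gauss-Markov process (real rho, |rho| < 1)."""
    return (1.0 - rho ** 2) / (1.0 - 2.0 * rho * math.cos(omega) + rho ** 2)


def prediction_error(spectrum, snr=None, num_points=20000):
    """Evaluate exp{(1/2pi) * integral of log(s + 1/SNR)} over [-pi, pi] for T = 1.

    With snr=None the noise term is dropped, which corresponds to (3).
    """
    offset = 0.0 if snr is None else 1.0 / snr
    step = 2.0 * math.pi / num_points
    total = 0.0
    for i in range(num_points):
        omega = -math.pi + (i + 0.5) * step  # midpoint of the i-th cell
        total += math.log(spectrum(omega) + offset)
    return math.exp(total * step / (2.0 * math.pi))


rho = 0.9
sigma_inf = prediction_error(lambda w: gauss_markov_spectrum(rho, w))
sigma_snr = prediction_error(lambda w: gauss_markov_spectrum(rho, w), snr=100.0)
print(sigma_inf, sigma_snr)
```

For this particular spectrum the noiseless one-step prediction error is known in closed form, $1 - \rho^2$, so `sigma_inf` should be close to that value, while `sigma_snr` should lie between $\Sigma(\infty) + 1/\mathrm{SNR}$ and $1 + 1/\mathrm{SNR}$.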

The remainder of this paper is organized as follows. In Section III we establish single-letter upper and lower
bounds on channel capacity, and use these bounds to analyze the asymptotic capacity in the high SNR regime. In
Section IV we discuss the robustness of the asymptotic capacity results, and the interplay between the codeword
length, communication rate and decoding error probability. In Section V we adapt the formula of Verdú for capacity
per unit cost [9] to our channel model, and use it to derive an expression for the capacity per unit energy in the
presence of a peak power constraint. We summarize our results in Section VI.

III. Asymptotic Capacity at High SNR 
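We denote the capacity with peak power constraint SNR by $C(\mathrm{SNR})$. For any $n \in \mathbb{N}$ and $\mathrm{SNR} > 0$, let

$$D_n(\mathrm{SNR}) = \left\{ x^n \in \mathbb{C}^n : \max_{1 \le t \le n} |x_t|^2 \le \mathrm{SNR} \right\}.$$

Let $\mathcal{P}_n(\mathrm{SNR})$ be the set of probability distributions on $D_n(\mathrm{SNR})$. Since the channel is block-wise stationary and
ergodic, a coding theorem exists and we have

$$C(\mathrm{SNR}) = \lim_{n \to \infty} \sup_{p_{x^n} \in \mathcal{P}_n(\mathrm{SNR})} \frac{1}{n} I(x^n; y^n).$$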

A. Lower Bound and Upper Bound 

To derive a lower bound on $C(\mathrm{SNR})$ for the channel model given in (1), we adopt the interleaved
decision-oriented training scheme proposed in [10] with some modifications. This scheme can also be viewed as a way of
interpreting the computations in [3, Sec. IV.E], [6, Sec. V].

Let $p(x)$ be a circularly symmetric distribution with $|x|^2 \in [x_{\min}^2, \mathrm{SNR}]$. Construct the codebook
$C = C_1 \times C_2 \times \cdots \times C_K$ with $K$ sub-codebooks $C_1, C_2, \cdots, C_K$, where codebook $C_i$ ($i = 1, \cdots, K$) contains $2^{nR_i}$ codewords of
length $n$ generated independently symbol by symbol using distribution $p(x)$. We assume that $K$ is a multiple of
the block length $T$, i.e., $K = rT$ for some positive integer $r$.

Now we multiplex (or interleave) these $K$ codebooks. Specifically, codebook $C_i$ ($i = 1, \cdots, K$) is used at time
instants $i, i + K, i + 2K, \cdots, i + (n-1)K$. For codebook $C_1$, its codeword can be successfully decoded if

$$R_1 \le \frac{1}{n} I\left(\{x_{1+jK}\}_{j=0}^{n-1}; \{y_{1+jK}\}_{j=0}^{n-1}\right)$$

for sufficiently large $n$. Furthermore, using the facts that $\{x_{1+jK}\}_{j=0}^{n-1}$ are i.i.d. and that the channel is stationary
over time instants $1, 1 + K, 1 + 2K, \cdots, 1 + (n-1)K$, we get

$$\begin{aligned}
\frac{1}{n} I\left(\{x_{1+jK}\}_{j=0}^{n-1}; \{y_{1+jK}\}_{j=0}^{n-1}\right)
&= \frac{1}{n} \sum_{j=0}^{n-1} I\left(x_{1+jK}; \{y_{1+iK}\}_{i=0}^{n-1} \,\Big|\, \{x_{1+iK}\}_{i=j+1}^{n-1}\right) \\
&= \frac{1}{n} \sum_{j=0}^{n-1} I\left(x_{1+jK}; \{y_{1+iK}\}_{i=0}^{n-1}, \{x_{1+iK}\}_{i=j+1}^{n-1}\right) \\
&\ge \frac{1}{n} \sum_{j=0}^{n-1} I\left(x_{1+jK}; y_{1+jK}\right) \\
&= I(x_1; y_1).
\end{aligned} \tag{4}$$

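This is to be expected since a channel with memory has a higher reliable communication rate than the memoryless
channel with the same marginal transition probability. Thus, reliable communication at rate $R_1 = I(x_1; y_1)$ is
possible for sub-codebook $C_1$. After $\{x_{1+jK}\}_{j=0}^{n-1}$ is successfully decoded, the receiver can use these values as well as
$\{y_{1+jK}\}_{j=0}^{n-1}$ to estimate $\{h_{2+jK}\}_{j=0}^{n-1}$. Specifically, $(x_{1+jK}, y_{1+jK})$ is used to estimate $h_{2+jK}$, $j = 0, 1, \cdots, n-1$.
Since $|x|^2 \in [x_{\min}^2, \mathrm{SNR}]$, it is easy to verify the following Markov chain condition:

$$h_{1+jK} + \frac{1}{x_{\min}} z_{1+jK} \;\to\; (x_{1+jK}, y_{1+jK}) \;\to\; h_{2+jK}.$$

To facilitate the calculation, we assume that $h_{1+jK} + \frac{1}{x_{\min}} z_{1+jK}$ is used to estimate $h_{2+jK}$ by forming the MMSE
estimate $E\left(h_{2+jK} \,\big|\, h_{1+jK} + \frac{1}{x_{\min}} z_{1+jK}\right)$, $j = 0, 1, \cdots, n-1$. The receiver decodes the codeword in codebook
$C_2$ using $\left\{E\left(h_{2+jK} \,\big|\, h_{1+jK} + \frac{1}{x_{\min}} z_{1+jK}\right)\right\}_{j=0}^{n-1}$ as side information. Successful decoding is possible if

$$R_2 \le \frac{1}{n} I\left(\{x_{2+jK}\}_{j=0}^{n-1}; \{y_{2+jK}\}_{j=0}^{n-1}, \left\{E\left(h_{2+jK} \,\Big|\, h_{1+jK} + \tfrac{1}{x_{\min}} z_{1+jK}\right)\right\}_{j=0}^{n-1}\right).$$

Similar to (4), we can use the lower bound

$$\begin{aligned}
\frac{1}{n} I\left(\{x_{2+jK}\}_{j=0}^{n-1}; \{y_{2+jK}\}_{j=0}^{n-1}, \left\{E\left(h_{2+jK} \,\Big|\, h_{1+jK} + \tfrac{1}{x_{\min}} z_{1+jK}\right)\right\}_{j=0}^{n-1}\right)
&\ge I\left(x_2; y_2, E\left(h_2 \,\Big|\, h_1 + \tfrac{1}{x_{\min}} z_1\right)\right) \\
&= I\left(x_2; y_2 \,\Big|\, E\left(h_2 \,\Big|\, h_1 + \tfrac{1}{x_{\min}} z_1\right)\right)
\end{aligned}$$

to show that reliable communication at rate $R_2 = I\left(x_2; y_2 \,\big|\, E\left(h_2 \,\big|\, h_1 + \frac{1}{x_{\min}} z_1\right)\right)$ is possible for sub-codebook
$C_2$. By applying this procedure successively, we can conclude that for codebook $C_i$, reliable communication is
possible at rate

$$R_i = I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i-1}\right)\right), \quad i = 1, 2, \cdots, K.$$

Thus, using this interleaved decision-oriented training scheme, we can have reliable communication at an overall rate
of

$$R = \frac{1}{K} \sum_{i=1}^{K} R_i = \frac{1}{K} \sum_{i=1}^{K} I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i-1}\right)\right).$$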
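We show in Appendix I that, for each $i$,

$$\left\{ I\left(x_{i+jT}; y_{i+jT} \,\Big|\, E\left(h_{i+jT} \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i+jT-1}\right)\right) \right\}_{j}$$

is a monotone increasing sequence with

$$\lim_{j \to \infty} I\left(x_{i+jT}; y_{i+jT} \,\Big|\, E\left(h_{i+jT} \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i+jT-1}\right)\right)
= I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)\right).$$

Now we let $K$ go to infinity (i.e., we let $r \to \infty$ since $T$ is fixed), and we obtain

$$\begin{aligned}
\lim_{K \to \infty} R
&= \lim_{r \to \infty} \frac{1}{rT} \sum_{i=1}^{T} \sum_{j=0}^{r-1} I\left(x_{i+jT}; y_{i+jT} \,\Big|\, E\left(h_{i+jT} \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i+jT-1}\right)\right) \\
&= \frac{1}{T} \sum_{i=1}^{T} \lim_{r \to \infty} \frac{1}{r} \sum_{j=0}^{r-1} I\left(x_{i+jT}; y_{i+jT} \,\Big|\, E\left(h_{i+jT} \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=1}^{i+jT-1}\right)\right) \\
&= \frac{1}{T} \sum_{i=1}^{T} I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)\right).
\end{aligned}$$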

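This yields the single-letter lower bound

$$C(\mathrm{SNR}) \ge \frac{1}{T} \sum_{i=1}^{T} I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)\right) \tag{5}$$

where $x_1, x_2, \cdots, x_T$ all have the same distribution $p(x)$, which is to be optimized later.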

Remark 2: Although channel estimation and communication are intertwined in this interleaved decision-oriented
training scheme, the effect of channel memory is isolated from channel coding through interleaving. This is
because when $K$ is large enough, $h_i, h_{i+K}, h_{i+2K}, \cdots, h_{i+(n-1)K}$ are roughly independent. Thus the codeword
in codebook $C_i$, which is transmitted over time instants $i, i + K, i + 2K, \cdots, i + (n-1)K$, essentially
experiences a memoryless channel. This also suggests that as $K$ goes to infinity, the single-letter lower bound (5)
provides a correct estimate of the rate supported by this interleaved decision-oriented training scheme. We can see
that the channel memory manifests itself in the lower bound (5) only through
$E\left(h_i \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)$.

Furthermore, in (5), we can write $h_i$ as the sum of two independent random variables: the coherent fading
component $E\left(h_i \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)$, which is known to the receiver, and the non-coherent fading component
$h_i - E\left(h_i \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)$, which is unknown. Isolating the effect of channel memory facilitates the
channel code design: we only need to design channel codes for memoryless fading channels with different coherent
and non-coherent components, instead of designing different codes for channels with different memory structures.

To derive a single-letter upper bound on $C(\mathrm{SNR})$, we follow the approach in [6]. The capacity $C(\mathrm{SNR})$ is given
by

$$C(\mathrm{SNR}) = \lim_{n \to \infty} \sup_{p_{x^n} \in \mathcal{P}_n(\mathrm{SNR})} \frac{1}{n} I(x^n; y^n).$$
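By the chain rule,

$$I(x^n; y^n) = \sum_{k=1}^{n} I(x^n; y_k | y^{k-1}).$$

We can upper-bound $I(x^n; y_k | y^{k-1})$ as

$$\begin{aligned}
I(x^n; y_k | y^{k-1}) &\le I(x^k, y^{k-1}; y_k) \\
&\le I\left(x_k, h_{k-1} + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_{k-1}, \cdots, h_1 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_1; y_k\right) && (6) \\
&\le I\left(x_k, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}; y_k\right). && (7)
\end{aligned}$$

Since $\left(x_k, E\left(h_k \,\big|\, \left\{h_t + \frac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}\right)\right)$ is a sufficient statistic for estimating $y_k$ from
$\left(x_k, \left\{h_t + \frac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}\right)$, it follows that

$$I\left(x_k, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}; y_k\right) = I\left(x_k, E\left(h_k \,\Big|\, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}\right); y_k\right).$$

Note that by the block stationarity of the fading process,

$$I\left(x_k, E\left(h_k \,\Big|\, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}\right); y_k\right)$$

depends on $k$ only through $(k \bmod T)$. Therefore, we obtain the single-letter upper bound

$$C(\mathrm{SNR}) \le \frac{1}{T} \sum_{k=1}^{T} \sup_{p_{x_k} \in \mathcal{P}_1(\mathrm{SNR})} I\left(x_k, E\left(h_k \,\Big|\, \left\{h_t + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_t\right\}_{t=-\infty}^{k-1}\right); y_k\right). \tag{8}$$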



B. Asymptotic Analysis 

Now we proceed to show that the lower bound (5) and upper bound (8) together characterize the asymptotic
behavior of $C(\mathrm{SNR})$ in the high SNR regime.

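Lemma 1: For every $\xi \in [\xi_0, \xi_1]$, let $A(\xi)$ be an $M \times M$ symmetric positive semidefinite matrix. We have

$$\lim_{\epsilon \to 0} \frac{\int_{\xi_0}^{\xi_1} \log\det[A(\xi) + \epsilon I_M]\, d\xi}{\log \epsilon} = \sum_{i=0}^{M} (M - i)\, \mu(\mathrm{rank}(A(\xi)) = i)$$

where $\mu(\mathrm{rank}(A(\xi)) = i)$ is the Lebesgue measure of the set $\{\xi : \mathrm{rank}(A(\xi)) = i\}$.
For the special case where $A(\xi) = A$ for all $\xi \in [\xi_0, \xi_1]$ and $\xi_1 - \xi_0 = 1$, we get

$$\lim_{\epsilon \to 0} \frac{\log\det[A + \epsilon I_M]}{\log \epsilon} = M - \mathrm{rank}(A).$$

Proof: See Appendix II.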
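The special case of Lemma 1 can be checked numerically. The sketch below is an illustration only: the rank-one test matrix, the value of $\epsilon$, and the helper names are assumptions chosen for the example, and the determinant is computed by plain Gaussian elimination rather than by any method used in the paper.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def log_det_ratio(matrix, eps):
    """Return log det[A + eps*I] / log(eps), which should approach M - rank(A)."""
    m = len(matrix)
    shifted = [[matrix[i][j] + (eps if i == j else 0.0) for j in range(m)]
               for i in range(m)]
    return math.log(determinant(shifted)) / math.log(eps)


# Rank-one 3 x 3 matrix A = v v^T with a unit-norm v, so M - rank(A) = 2.
v = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
A = [[vi * vj for vj in v] for vi in v]
ratio = log_det_ratio(A, 1e-8)
print(ratio)
```

Here $\det[A + \epsilon I_3] = \epsilon^2 (1 + \epsilon)$, so the printed ratio should be very close to 2.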
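Lemma 2 ([6], Sec. V): If $x$ is uniformly distributed over the set $\left\{z \in \mathbb{C} : \frac{\sqrt{\mathrm{SNR}}}{2} \le |z| \le \sqrt{\mathrm{SNR}}\right\}$,
$\hat{h} \sim \mathcal{CN}(0, E|\hat{h}|^2)$, $\tilde{h} \sim \mathcal{CN}(0, E|\tilde{h}|^2)$, $z \sim \mathcal{CN}(0, 1)$, and $x, \hat{h}, \tilde{h}, z$ are all independent, then

$$I\left(x; (\hat{h} + \tilde{h})x + z \,\big|\, \hat{h}\right) \ge -\log\left(E|\tilde{h}|^2 + \frac{8}{5\,\mathrm{SNR}}\right) + \log\left(1 - E|\tilde{h}|^2\right) - \gamma - \log\frac{5e}{6} \tag{9}$$

where $\gamma$ is the Euler constant.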

Theorem 1: For the block-stationary Gaussian fading channel model given in (1),

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = \lim_{\mathrm{SNR} \to \infty} \frac{-\log\det[\Sigma(\mathrm{SNR})]}{T \log \mathrm{SNR}} = \frac{1}{2\pi T} \sum_{i=0}^{T} (T - i)\, \mu\left(\mathrm{rank}(S(e^{j\omega})) = i\right). \tag{10}$$

Remark 3: The second equality in (10) follows from (2) and Lemma 1.

Proof: Below we provide an intuitive explanation of this theorem based on the lower bound (5). The details
of the proof are left to Appendix III.



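In the lower bound (5), let $x_i$ be uniformly distributed over the set $\left\{z \in \mathbb{C} : \frac{\sqrt{\mathrm{SNR}}}{2} \le |z| \le \sqrt{\mathrm{SNR}}\right\}$,
so that $x_{\min} = \sqrt{\mathrm{SNR}}/2$, and write $h_i$ as $h_i = \hat{h}_i + \tilde{h}_i$ where
$\hat{h}_i = E\left(h_i \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)$. Suppose $E|\tilde{h}_i|^2 \approx \mathrm{SNR}^{-r_i}$, $i = 1, 2, \cdots, T$.
We can then write $y_i$ as $y_i = \hat{h}_i x_i + w_i$ where $w_i = \tilde{h}_i x_i + z_i$ with $E|w_i|^2 \approx \mathrm{SNR}^{1-r_i}$. By viewing $y_i$ as the
output of a coherent fading channel with the fading $\hat{h}_i$ known at the receiver and noise $w_i$, we get

$$\begin{aligned}
I\left(x_i; h_i x_i + z_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)\right)
&= I\left(x_i; \hat{h}_i x_i + w_i \,\big|\, \hat{h}_i\right) \\
&\approx \log\frac{\mathrm{SNR}}{\mathrm{SNR}^{1-r_i}} \\
&= r_i \log \mathrm{SNR}.
\end{aligned}$$

Thus the lower bound (5) can be approximated by

$$\frac{1}{T} \sum_{i=1}^{T} I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{i-1}\right)\right) \approx \frac{1}{T} \sum_{i=1}^{T} r_i \log \mathrm{SNR}.$$

We then complete the proof by showing that $\sum_{i=1}^{T} r_i$ is related to the matrix spectral density function $S(e^{j\omega})$ by
the equation

$$\sum_{i=1}^{T} r_i = \frac{1}{2\pi} \sum_{i=0}^{T} (T - i)\, \mu\left(\mathrm{rank}(S(e^{j\omega})) = i\right).$$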



Theorem 1 generalizes many previous results on the noncoherent capacity for Gaussian channels in the high SNR
regime as we illustrate in the following subsection.

C. Previous Results as Special Cases of Theorem 1
Example 1: Constant Fading within Block 

For the special case where the fading remains constant within a block, i.e., $h_{kT+1} = h_{kT+2} = \cdots = h_{kT+T}$
for all $k \in \mathbb{Z}$, all the entries of $R(i)$ for any fixed $i$ are identical. This implies that, for any fixed $\omega$, all the entries
of $S(e^{j\omega})$ are identical, which we shall denote by $s(e^{j\omega})$. It is easy to see that $s(e^{j\omega})$ is essentially the spectral
density function of $\{h_{kT}\}_k$. The rank of $S(e^{j\omega})$ is 1 if $s(e^{j\omega}) > 0$, and is 0 if $s(e^{j\omega}) = 0$. We therefore have

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = \frac{1}{2\pi T} \sum_{i=0}^{T} (T - i)\, \mu\left(\mathrm{rank}(S(e^{j\omega})) = i\right) = 1 - \frac{\mu(s(e^{j\omega}) > 0)}{2\pi T}. \tag{11}$$

When $T = 1$, we recover the result in [6] that

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = \frac{\mu(s(e^{j\omega}) = 0)}{2\pi}$$

which illustrates the effect of large timescale correlation of the fading process on the pre-log term of the channel
capacity in the high SNR regime. When the fading is independent from block to block, we have $\mu(s(e^{j\omega}) > 0) = 2\pi$,
and thus recover the result in [1], [2] that

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = \frac{T - 1}{T}$$

which illustrates the effect of short timescale correlation of the fading process on the pre-log term of the capacity
at high SNR.
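The pre-log expression of Theorem 1 is a weighted sum of rank measures, so it is easy to evaluate once those measures are known. The following sketch assumes the caller supplies the Lebesgue measures $\mu(\mathrm{rank}(S(e^{j\omega})) = i)$ directly as a list; the function name and input format are illustrative assumptions rather than anything defined in the paper.

```python
import math


def prelog_from_rank_measures(block_length, measures):
    """Evaluate (1 / (2*pi*T)) * sum_i (T - i) * mu(rank S = i).

    measures[i] is the Lebesgue measure of {omega : rank S(e^{j omega}) = i},
    for i = 0, ..., T; the entries must sum to 2*pi.
    """
    if len(measures) != block_length + 1:
        raise ValueError("need one measure for each rank 0..T")
    if abs(sum(measures) - 2.0 * math.pi) > 1e-9:
        raise ValueError("rank measures must sum to 2*pi")
    weighted = sum((block_length - i) * m for i, m in enumerate(measures))
    return weighted / (2.0 * math.pi * block_length)


# Constant fading within a block of length T = 4, independent across blocks:
# rank S = 1 for every omega, so the pre-log should be (T - 1) / T.
block_independent = prelog_from_rank_measures(4, [0.0, 2.0 * math.pi, 0.0, 0.0, 0.0])

# T = 1 with a spectrum that vanishes on a set of measure pi / 2:
# the pre-log should be mu(s = 0) / (2*pi) = 1/4.
nonregular = prelog_from_rank_measures(1, [math.pi / 2.0, 3.0 * math.pi / 2.0])
print(block_independent, nonregular)
```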

Example 2: Time-Selectivity within Block

In this example, we recover the main result in [7] concerning the case where rank deficiency is caused purely 
by the correlation within a block. 

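If $\mathrm{rank}(\Sigma(\infty)) = \mathrm{rank}(R(0))$, then$^1$

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}} = \frac{T - \mathrm{rank}(R(0))}{T}. \tag{12}$$

$^1$The condition $\mathrm{rank}(\Sigma(\infty)) = \mathrm{rank}(R(0))$ is satisfied, for instance, when the fading process is independent from block to block.

To prove (12), we first note that

$$\begin{aligned}
\Sigma(\infty) + \frac{1}{\mathrm{SNR}} I_T
&= \mathrm{var}\left( \left(h_1 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_1, \cdots, h_T + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_T\right)^T \,\Big|\, \{h_k\}_{k=-\infty}^{0} \right) \\
&\preceq \mathrm{var}\left( \left(h_1 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_1, \cdots, h_T + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_T\right)^T \,\Big|\, \left\{h_k + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_k\right\}_{k=-\infty}^{0} \right) \\
&= \Sigma(\mathrm{SNR}) \\
&\preceq \mathrm{var}\left( \left(h_1 + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_1, \cdots, h_T + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_T\right)^T \right) \\
&= R(0) + \frac{1}{\mathrm{SNR}} I_T.
\end{aligned}$$

We therefore have the bound

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log\det\left[\Sigma(\infty) + \frac{1}{\mathrm{SNR}} I_T\right]}{\log \mathrm{SNR}}
\le \lim_{\mathrm{SNR} \to \infty} \frac{\log\det \Sigma(\mathrm{SNR})}{\log \mathrm{SNR}}
\le \lim_{\mathrm{SNR} \to \infty} \frac{\log\det\left[R(0) + \frac{1}{\mathrm{SNR}} I_T\right]}{\log \mathrm{SNR}}.$$

By Lemma 1,

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log\det\left[\Sigma(\infty) + \frac{1}{\mathrm{SNR}} I_T\right]}{\log \mathrm{SNR}}
= \lim_{\mathrm{SNR} \to \infty} \frac{\log\det\left[R(0) + \frac{1}{\mathrm{SNR}} I_T\right]}{\log \mathrm{SNR}}
= -T + \mathrm{rank}(R(0))$$

which implies that

$$\lim_{\mathrm{SNR} \to \infty} \frac{\log\det \Sigma(\mathrm{SNR})}{\log \mathrm{SNR}} = -T + \mathrm{rank}(R(0)).$$

Therefore, by Theorem 1,

$$\lim_{\mathrm{SNR} \to \infty} \frac{C(\mathrm{SNR})}{\log \mathrm{SNR}}
= \lim_{\mathrm{SNR} \to \infty} \frac{-\log\det \Sigma(\mathrm{SNR})}{T \log \mathrm{SNR}}
= \frac{T - \mathrm{rank}(R(0))}{T}.$$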

It is worth noting that in this case the pre-log term of the capacity can be achieved by a scheme simpler than
the aforementioned interleaved decision-oriented training scheme. Suppose the rank of $R(0)$ is $Q$, so that $R(0)$ has
a $Q \times Q$ positive definite principal submatrix. Without loss of generality, suppose this submatrix is the covariance
matrix of $(h_1, h_2, \cdots, h_Q)^T$. Then $h_{kT+i}$ can be represented as a linear combination of $h_{kT+1}, h_{kT+2}, \cdots, h_{kT+Q}$
for any $k \in \mathbb{Z}$ and $i \in \{Q + 1, Q + 2, \cdots, T\}$. The simpler scheme is described as follows:

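The transmitter sends deterministic training symbols with maximum power at time instants $kT + 1, kT + 2, \cdots, kT + Q$,
i.e., $x_{kT+1} = x_{kT+2} = \cdots = x_{kT+Q} = \sqrt{\mathrm{SNR}}$, where $k = 0, 1, 2, \cdots$. The receiver can form
the MMSE estimates $E\left(h_{kT+i} \,\big|\, \left\{h_{kT+j} + \frac{1}{\sqrt{\mathrm{SNR}}} z_{kT+j}\right\}_{j=1}^{Q}\right)$ for $i = Q + 1, Q + 2, \cdots, T$ and $k = 0, 1, 2, \cdots$.
Clearly, we have

$$\mathrm{var}\left(h_{kT+i} \,\Big|\, \left\{h_{kT+j} + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_{kT+j}\right\}_{j=1}^{Q}\right) = \mathrm{var}\left(h_{kT+i} \,\Big|\, \{y_{kT+j}\}_{j=1}^{Q}\right),$$

since $y_{kT+j} = \sqrt{\mathrm{SNR}}\, h_{kT+j} + z_{kT+j}$ for the training symbols.
With the side information $\left\{E\left(h_{kT+i} \,\big|\, \left\{h_{kT+j} + \frac{1}{\sqrt{\mathrm{SNR}}} z_{kT+j}\right\}_{j=1}^{Q}\right)\right\}_{k=0}^{\infty}$ at the receiver, we can communicate
reliably at time instants $i, T + i, 2T + i, \cdots$ with rate at least
$I\left(x_i; y_i \,\big|\, E\left(h_i \,\big|\, \left\{h_j + \frac{1}{\sqrt{\mathrm{SNR}}} z_j\right\}_{j=1}^{Q}\right)\right)$. Let $x_i$ be
uniformly distributed over the set $\left\{z \in \mathbb{C} : \frac{\sqrt{\mathrm{SNR}}}{2} \le |z| \le \sqrt{\mathrm{SNR}}\right\}$. By Lemma 2,

$$\begin{aligned}
I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_j + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_j\right\}_{j=1}^{Q}\right)\right)
&\ge -\log\left(\mathrm{var}\left(h_i \,\Big|\, \left\{h_j + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_j\right\}_{j=1}^{Q}\right) + \frac{8}{5\,\mathrm{SNR}}\right) \\
&\quad + \log\left(1 - \mathrm{var}\left(h_i \,\Big|\, \left\{h_j + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_j\right\}_{j=1}^{Q}\right)\right) - \gamma - \log\frac{5e}{6} \\
&= \log \mathrm{SNR} + o(\log \mathrm{SNR}),
\end{aligned}$$

where the last step uses the fact that $h_i$ is a linear combination of $h_1, \cdots, h_Q$, so the conditional variance decays as $O(1/\mathrm{SNR})$.

Therefore, the overall rate is lower-bounded by

$$\begin{aligned}
\frac{1}{T} \sum_{i=Q+1}^{T} I\left(x_i; y_i \,\Big|\, E\left(h_i \,\Big|\, \left\{h_j + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_j\right\}_{j=1}^{Q}\right)\right)
&\ge \frac{T - Q}{T} \log \mathrm{SNR} + o(\log \mathrm{SNR}) \\
&= \frac{T - \mathrm{rank}(R(0))}{T} \log \mathrm{SNR} + o(\log \mathrm{SNR})
\end{aligned}$$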
and the pre-log term is achieved. This scheme has the following obvious advantages over the interleaved
decision-oriented training scheme: (i) channel estimation and communication are completely decoupled; and (ii) channel
estimation is done locally since the estimate $E\left(h_{kT+i} \,\big|\, \left\{h_{kT+j} + \frac{1}{\sqrt{\mathrm{SNR}}} z_{kT+j}\right\}_{j=1}^{Q}\right)$, $i = Q + 1, Q + 2, \cdots, T$,
only depends on $h_{kT+1}, h_{kT+2}, \cdots, h_{kT+Q}$ and the noise at the same time instants.



D. Regular Block-Stationary Process 

The following theorem generalizes [3, Corollary 4.42] for regular Gaussian fading processes to the block-stationary 
case. 

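Theorem 2: If $\det(\Sigma(\infty)) > 0$, then

$$\lim_{\mathrm{SNR} \to \infty} \left[C(\mathrm{SNR}) - \log\log \mathrm{SNR}\right] = -1 - \gamma - \frac{1}{T} \log\det(\Sigma(\infty)) = -1 - \gamma - \frac{1}{2\pi T} \int_{-\pi}^{\pi} \log\det\left[S(e^{j\omega})\right] d\omega. \tag{13}$$

Remark 4: The second equality in (13) follows from (3).

Proof: See Appendix IV.

Example 3: Gauss-Markov Process

Suppose $\{h_t\}_{t=-\infty}^{\infty}$ is a Gauss-Markov process with $E(h_{t+1} h_t^*) = \rho_1$ if $(t \bmod T) = 0$, and $= \rho_2$ otherwise.
Here $\rho_1, \rho_2$ are complex numbers with $\max(|\rho_1|, |\rho_2|) < 1$. In this case, we have

$$\det(\Sigma(\infty)) = (1 - |\rho_1|^2)(1 - |\rho_2|^2)^{T-1}.$$

Therefore, by Theorem 2,

$$\lim_{\mathrm{SNR} \to \infty} \left[C(\mathrm{SNR}) - \log\log \mathrm{SNR}\right] = -1 - \gamma - \log(1 - |\rho_2|^2) - \frac{1}{T} \log\frac{1 - |\rho_1|^2}{1 - |\rho_2|^2}.$$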
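The constant in Theorem 2 for the Gauss-Markov example can be evaluated directly. The sketch below assumes natural logarithms and uses a hard-coded value of the Euler constant; the function names and the sample values of $\rho_1$, $\rho_2$, and $T$ are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant, hard-coded (not in the math module)


def high_snr_constant(log_det_sigma_inf, block_length):
    """Evaluate -1 - gamma - (1/T) * log det(Sigma(inf)), the limit in Theorem 2."""
    return -1.0 - EULER_GAMMA - log_det_sigma_inf / block_length


def gauss_markov_log_det(rho1, rho2, block_length):
    """log det(Sigma(inf)) = log[(1 - |rho1|^2) * (1 - |rho2|^2)^(T - 1)]."""
    return (math.log(1.0 - abs(rho1) ** 2)
            + (block_length - 1) * math.log(1.0 - abs(rho2) ** 2))


rho1, rho2, T = 0.5, 0.9, 4
constant = high_snr_constant(gauss_markov_log_det(rho1, rho2, T), T)

# The same quantity written in the form used in Example 3.
closed_form = (-1.0 - EULER_GAMMA - math.log(1.0 - abs(rho2) ** 2)
               - math.log((1.0 - abs(rho1) ** 2) / (1.0 - abs(rho2) ** 2)) / T)
print(constant, closed_form)
```

Setting `rho1 == rho2` or `T = 1` collapses both expressions to $-1 - \gamma - \log(1 - |\rho_1|^2)$, which is the symbol-by-symbol stationary case.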

IV. Symbol-by-Symbol Stationary Fading Model 

For simplicity, we assume in this section that the fading process is symbol-by-symbol stationary, i.e., $T = 1$. In
this case, Theorem 1 specializes to $\lim_{\mathrm{SNR} \to \infty} C(\mathrm{SNR}) / \log \mathrm{SNR} = \mu(s(e^{j\omega}) = 0) / (2\pi)$.

A. Best- and Worst-Case Spectral Densities 

We can see that two fading processes with spectral density functions $s_1(e^{j\omega})$ and $s_2(e^{j\omega})$ can induce the same
pre-log term in the high SNR regime as long as $\mu(s_1(e^{j\omega}) = 0) = \mu(s_2(e^{j\omega}) = 0)$. But in the non-asymptotic
regime, the capacities of these two channels may behave very differently. So, for a fixed $\mu(s(e^{j\omega}) = 0)$, it is natural
to ask the question: which spectral density function $s(e^{j\omega})$ gives the largest (or smallest) channel capacity at a
given SNR? This question is difficult to answer since we do not have a closed-form expression for noncoherent
channel capacity. We therefore turn to the lower bound (5) to formulate a closely-related problem.

When $T = 1$, the lower bound (5) can be reduced to

$$C(\mathrm{SNR}) \ge I\left(x_1; h_1 x_1 + z_1 \,\Big|\, E\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\right). \tag{14}$$
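We can see that the lower bound (14) depends on $s(e^{j\omega})$ only through
$E\left(h_1 \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)$. Furthermore,
if we fix the input distribution $p(x_1)$, then

$$\mathrm{var}\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s_1(e^{j\omega})} \le \mathrm{var}\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s_2(e^{j\omega})}$$

implies that

$$I\left(x_1; h_1 x_1 + z_1 \,\Big|\, E\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\right)\Bigg|_{s_1(e^{j\omega})}
\ge I\left(x_1; h_1 x_1 + z_1 \,\Big|\, E\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\right)\Bigg|_{s_2(e^{j\omega})}.$$

We can therefore ask which $s(e^{j\omega})$ gives the largest (or smallest)
$\mathrm{var}\left(h_1 \,\big|\, \left\{h_k + \frac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)$. More
precisely, since

$$\begin{aligned}
\mathrm{var}\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s(e^{j\omega})}
&= \mathrm{var}\left(h_1 + \tfrac{1}{x_{\min}} z_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s(e^{j\omega})} - \frac{1}{x_{\min}^2} \\
&= \exp\left\{\frac{1}{2\pi} \int_{-\pi}^{\pi} \log\left[s(e^{j\omega}) + \frac{1}{x_{\min}^2}\right] d\omega\right\} - \frac{1}{x_{\min}^2},
\end{aligned}$$

we can formulate the problem in the following form:

$$\arg\max \ (\text{or } \arg\min) \int_{-\pi}^{\pi} \log\left[s(e^{j\omega}) + \frac{1}{x_{\min}^2}\right] d\omega \tag{15}$$

subject to

$$s(e^{j\omega}) \ge 0, \quad \frac{1}{2\pi} \int_{-\pi}^{\pi} s(e^{j\omega})\, d\omega = 1, \quad \mu(s(e^{j\omega}) = 0) = \alpha,$$

where $\alpha \in [0, 2\pi)$. Due to the strict concavity of $\log(\cdot)$, it is easy to show that the maximizers of the optimization
problem (15) are the set of spectral density functions $s(e^{j\omega})$ with the property:

$$\mu\left(s(e^{j\omega}) = \frac{2\pi}{2\pi - \alpha}\right) = 2\pi - \alpha, \quad \mu(s(e^{j\omega}) = 0) = \alpha.$$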

This solution has the following interpretation. Without constraints on the spectral density function, the worst fading
process is the i.i.d. Gaussian process whose spectral density function is flat. With the constraint $\mu(s(e^{j\omega}) = 0) = \alpha$,
the spectral density function $s(e^{j\omega})$ cannot be completely flat, but the worst fading process should have a spectral
density function that is as flat as possible, i.e., the correlation in the time domain should be the weakest possible.
Note that the solution does not depend on $x_{\min}$. We can use this fact to derive a universal lower bound on $C(\mathrm{SNR})$
for the class of spectral density functions $\{s(e^{j\omega}) : \mu(s(e^{j\omega}) = 0) = \alpha\}$, which has further implications for the
high SNR asymptotic behavior of $C(\mathrm{SNR})$. Let $s_{\max}(e^{j\omega})$ be a maximizer of (15). We have

$$\mathrm{var}\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s_{\max}(e^{j\omega})}
= \left(\frac{2\pi}{2\pi - \alpha} + \frac{1}{x_{\min}^2}\right)^{\frac{2\pi - \alpha}{2\pi}} \left(\frac{1}{x_{\min}^2}\right)^{\frac{\alpha}{2\pi}} - \frac{1}{x_{\min}^2} \triangleq \theta(\alpha, x_{\min}).$$
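For piecewise-constant spectra the prediction error variance has a closed form, which makes the "flattest is worst" claim easy to check on examples. In the sketch below a spectrum is represented as a list of (value, Lebesgue measure) pairs; this representation, the function names, and the comparison spectrum are illustrative assumptions.

```python
import math

TWO_PI = 2.0 * math.pi


def prediction_error_variance(levels, x_min):
    """exp{(1/2pi) * integral of log(s + 1/x_min^2)} - 1/x_min^2 for piecewise-constant s.

    levels is a list of (spectral value, Lebesgue measure) pairs covering [-pi, pi).
    """
    noise = 1.0 / x_min ** 2
    if abs(sum(m for _, m in levels) - TWO_PI) > 1e-9:
        raise ValueError("measures must sum to 2*pi")
    if abs(sum(v * m for v, m in levels) / TWO_PI - 1.0) > 1e-9:
        raise ValueError("spectrum must have unit average power")
    mean_log = sum(m * math.log(v + noise) for v, m in levels) / TWO_PI
    return math.exp(mean_log) - noise


def flattest_spectrum(alpha):
    """Spectrum equal to 2*pi/(2*pi - alpha) off a zero set of measure alpha."""
    return [(TWO_PI / (TWO_PI - alpha), TWO_PI - alpha), (0.0, alpha)]


x_min = 10.0
flat = prediction_error_variance(flattest_spectrum(math.pi), x_min)
# A non-flat spectrum with the same zero set measure (alpha = pi) and unit power.
uneven = prediction_error_variance(
    [(3.0, math.pi / 2.0), (1.0, math.pi / 2.0), (0.0, math.pi)], x_min)
iid = prediction_error_variance(flattest_spectrum(0.0), x_min)
print(flat, uneven, iid)
```

With $\alpha = 0$ the flattest spectrum is the i.i.d. case and the variance equals 1; for $\alpha = \pi$ the flat spectrum gives a larger prediction error than the uneven one, in line with the concavity argument.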
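For any spectral density function $s(e^{j\omega})$ with $\mu(s(e^{j\omega}) = 0) = \alpha$, we have

$$\begin{aligned}
C(\mathrm{SNR})\big|_{s(e^{j\omega})}
&\ge I\left(x_1; h_1 x_1 + z_1 \,\Big|\, E\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\right)\Bigg|_{s_{\max}(e^{j\omega})} \\
&= I\left(x; (\hat{h} + \tilde{h})x + z \,\big|\, \hat{h}\right)
\end{aligned} \tag{16}$$

where $x$ is uniformly distributed over the set $\left\{z \in \mathbb{C} : x_{\min} \le |z| \le \sqrt{\mathrm{SNR}}\right\}$, $\hat{h} \sim \mathcal{CN}(0, 1 - \theta(\alpha, x_{\min}))$,
$\tilde{h} \sim \mathcal{CN}(0, \theta(\alpha, x_{\min}))$, $z \sim \mathcal{CN}(0, 1)$, and $x, \hat{h}, \tilde{h}, z$ are all independent. We can further optimize over $p(x)$ to tighten
the lower bound (16).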

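The minimizers of (15) do not exist. Consider the following set of spectral density functions $\{s_\theta(e^{j\omega})\}_\theta$ given by

$$s_\theta(e^{j\omega}) = \begin{cases}
\dfrac{1}{\theta}, & |\omega| \in \left[0,\ \pi - \dfrac{\alpha}{2} - \dfrac{1}{2\theta}\right] \\[2mm]
0, & |\omega| \in \left(\pi - \dfrac{\alpha}{2} - \dfrac{1}{2\theta},\ \pi - \dfrac{1}{2\theta}\right] \\[2mm]
\dfrac{2\pi\theta^2 - 2\pi\theta + \alpha\theta + 1}{\theta}, & |\omega| \in \left(\pi - \dfrac{1}{2\theta},\ \pi\right]
\end{cases}$$

where $\theta > \theta_0$ with

$$\theta_0 = \begin{cases}
\dfrac{1}{2\pi - \alpha}, & (2\pi - \alpha)^2 < 8\pi \\[2mm]
\dfrac{2\pi - \alpha + \sqrt{(2\pi - \alpha)^2 - 8\pi}}{4\pi}, & (2\pi - \alpha)^2 \ge 8\pi.
\end{cases}$$

We can compute

$$\begin{aligned}
\lim_{\theta \to \infty} \int_{-\pi}^{\pi} \log\left[s_\theta(e^{j\omega}) + \frac{1}{x_{\min}^2}\right] d\omega
&= \lim_{\theta \to \infty} \left[ -2\alpha \log x_{\min} + \left(2\pi - \alpha - \frac{1}{\theta}\right) \log\left(\frac{1}{\theta} + \frac{1}{x_{\min}^2}\right) + \frac{1}{\theta} \log\left(\frac{2\pi\theta^2 - 2\pi\theta + \alpha\theta + 1}{\theta} + \frac{1}{x_{\min}^2}\right) \right] \\
&= -4\pi \log x_{\min}.
\end{aligned}$$

Note that $\int_{-\pi}^{\pi} \log\left[s(e^{j\omega}) + \frac{1}{x_{\min}^2}\right] d\omega > -4\pi \log x_{\min}$ for any spectral density function $s(e^{j\omega})$.
Therefore, as $\theta$ goes to infinity, $\int_{-\pi}^{\pi} \log\left[s_\theta(e^{j\omega}) + \frac{1}{x_{\min}^2}\right] d\omega$
approaches the lower bound that is not attainable by any spectral density function. Intuitively, the fading process
associated with $s_\theta(e^{j\omega})$ becomes more and more deterministic as $\theta$ gets larger, and it can be verified that

$$\lim_{\theta \to \infty} \mathrm{var}\left(h_1 \,\Big|\, \left\{h_k + \tfrac{1}{x_{\min}} z_k\right\}_{k=-\infty}^{0}\right)\Bigg|_{s_\theta(e^{j\omega})} = 0.$$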

This result has interesting implications for the channel capacity. 
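Proposition 1: For any $r > 0$,

$$\liminf_{\mathrm{SNR} \to \infty,\ \theta = \mathrm{SNR}^r} \frac{C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}}{\log \mathrm{SNR}} \ge \frac{\alpha + \min(r, 1)(2\pi - \alpha)}{2\pi}.$$

If $r \ge 1$, then

$$\lim_{\mathrm{SNR} \to \infty,\ \theta = \mathrm{SNR}^r} \frac{C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}}{\log \mathrm{SNR}} = 1.$$

Proof: See Appendix V.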
Remark 5: Although by Theorem 1, for any fixed $\theta$, the ratio between $C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}$ and $\log \mathrm{SNR}$ converges
to $\frac{\alpha}{2\pi}$, Proposition 1 says that the convergence is not uniform with respect to $\theta$. This is intuitively clear because
when $\theta$ is large, we have $s_\theta(e^{j\omega}) \approx 0$ for $|\omega| \in \left[0, \pi - \frac{1}{2\theta}\right]$. Therefore it can be expected that for a large range of
SNR, the channel capacity $C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}$ behaves like $\left(1 - \frac{1}{2\pi\theta}\right) \log \mathrm{SNR}$, which could be significantly larger than
$\frac{\alpha}{2\pi} \log \mathrm{SNR}$. For the extreme case where $\alpha = 0$, by Theorem 2, for any fixed $\theta$, the capacity $C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}$ grows
like $\log\log \mathrm{SNR}$ at high SNR. But Proposition 1 implies that even in this extreme case, the capacity $C(\mathrm{SNR})\big|_{s_\theta(e^{j\omega})}$
for some $\theta$ can grow linearly with $\log \mathrm{SNR}$ for a large range of SNR. This is consistent with the result in [10] where
it was shown that for the Gauss-Markov process with $E(h_{i+1} h_i^*) = \rho$, the capacity $C(\mathrm{SNR})$ grows like $\log \mathrm{SNR}$
for a wide range of SNR levels if $|\rho|$ is close to 1. An intuitive explanation is that if $|\rho|$ is close to 1, the spectrum

$$s(e^{j\omega}) = \frac{1 - |\rho|^2}{1 - 2\,\mathrm{Re}(\rho e^{-j\omega}) + |\rho|^2}$$

is approximately zero for all $\omega$ except those around zero, and we can expect from the pre-log expression
$\mu(s(e^{j\omega}) = 0)/(2\pi)$ that $C(\mathrm{SNR})$ should
grow like $\log \mathrm{SNR}$ for a wide range of SNR. But it should be noted that as opposed to a Gauss-Markov process,
a general Gaussian stationary process cannot be characterized by a single parameter, and the behavior of $C(\mathrm{SNR})$
can be much more complicated as shown in the following example.

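Example 4: Consider the spectral density function

$$s(e^{j\omega}) = \begin{cases}
\epsilon_1, & |\omega| \le \pi\alpha_1 \\[1mm]
\epsilon_2, & |\omega| \in (\pi\alpha_1, \pi\alpha_2] \\[1mm]
\dfrac{1 - \alpha_1\epsilon_1 - (\alpha_2 - \alpha_1)\epsilon_2}{1 - \alpha_2}, & |\omega| \in (\pi\alpha_2, \pi]
\end{cases}$$

where $0 < \alpha_1 < \alpha_2 < 1$, and $\epsilon_1 \ll \epsilon_2 \ll 1$ (Note: For two positive numbers $a$ and $b$, $a \ll b$ means $\frac{b}{a}$ is much
greater than 1). We show in Appendix VI that $\frac{C(\mathrm{SNR})}{\log \mathrm{SNR}}$ is approximately equal to $\alpha_2$ for $\mathrm{SNR} \le \frac{1}{\epsilon_2}$ and gradually
decreases to $\alpha_1$ as SNR approaches $\frac{1}{\epsilon_1}$.

This example shows that $\frac{C(\mathrm{SNR})}{\log \mathrm{SNR}}$ can be highly SNR dependent. For a regular Gaussian fading process, it may
require unreasonably high (impractical) values of SNR in order for the noncoherent channel capacity $C(\mathrm{SNR})$ to
grow like $\log\log \mathrm{SNR}$, and the behavior of $C(\mathrm{SNR})$ at moderate SNR levels may depend highly on the spectral
density function.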
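The SNR dependence in Example 4 can be illustrated with a simple proxy. The sketch below does not compute the capacity; it evaluates $-\log\det[\Sigma(\mathrm{SNR})] / \log \mathrm{SNR}$ from (2) for $T = 1$, which by Theorem 1 has the same high-SNR limit as $C(\mathrm{SNR})/\log \mathrm{SNR}$. The parameter values and the function name are illustrative assumptions.

```python
import math


def effective_prelog(snr, alpha1, alpha2, eps1, eps2):
    """Proxy -log det[Sigma(SNR)] / log(SNR) for the three-level spectrum (T = 1).

    Each level occupies a fraction of [-pi, pi): alpha1, alpha2 - alpha1, 1 - alpha2.
    """
    top = (1.0 - alpha1 * eps1 - (alpha2 - alpha1) * eps2) / (1.0 - alpha2)
    pieces = [(eps1, alpha1), (eps2, alpha2 - alpha1), (top, 1.0 - alpha2)]
    # (1/2pi) * integral of log(s + 1/SNR) reduces to a weighted sum of logs.
    mean_log = sum(frac * math.log(val + 1.0 / snr) for val, frac in pieces)
    return -mean_log / math.log(snr)


alpha1, alpha2 = 0.2, 0.6
eps1, eps2 = 1e-12, 1e-4
low = effective_prelog(1e3, alpha1, alpha2, eps1, eps2)    # SNR below 1/eps2
mid = effective_prelog(1e8, alpha1, alpha2, eps1, eps2)    # between 1/eps2 and 1/eps1
high = effective_prelog(1e12, alpha1, alpha2, eps1, eps2)  # SNR equal to 1/eps1
print(low, mid, high)
```

For these values the proxy starts near $\alpha_2$ at moderate SNR and drifts down toward $\alpha_1$ as SNR approaches $1/\epsilon_1$, which matches the qualitative behavior described in the example.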

Overall, the above analysis suggests that great caution should be exercised when using the asymptotic results in
Theorem 1 and Theorem 2 to approximate the channel capacity $C(\mathrm{SNR})$ at a finite SNR level.

B. Finite Codeword-length Behavior 

Although the asymptotic capacity results might yield over-pessimistic approximations such as a log log SNR
growth with SNR, they could also lead to over-optimistic conclusions. In the capacity analysis, it is assumed that
the codeword is of infinite length. But when the length of codewords is finite, the situation can be dramatically
different. By Fano's inequality, the communication rate $R$ is upper-bounded by

$$R \le \frac{I(x^n; y^n) + 1}{n(1 - P_e)}$$

where $n$ is the codeword length and $P_e$ is the decoding error probability. Suppose we fix $n$ and $P_e$, and let SNR go
to infinity. For a symbol-by-symbol stationary Gaussian fading process, even if $\mu(s(e^{j\omega}) = 0) > 0$, the correlation
matrix of the fading process over any finite block length can still be full-rank. Note that $\frac{1}{n} I(x^n; y^n)$ is upper-bounded
by the capacity of a block-independent Gaussian fading channel with the correlation matrix of each block
given by $\mathbb{E}[h^n (h^n)^\dagger]$. Since $\mathbb{E}[h^n (h^n)^\dagger]$ is full rank, it follows from Theorem 2 that $\frac{1}{n} I(x^n; y^n)$, and hence $R$,
grows at most like log log SNR as SNR goes to infinity. Therefore, there is no nontrivial tradeoff between diversity
and multiplexing in the sense of [11]. If we want $R$ to grow linearly with log SNR while having the decoding error
probability $P_e$ bounded away from 1, the codeword length $n$ must scale with SNR. It is of interest to determine
how fast the codeword length $n$ should scale with SNR in order to guarantee that the rate $R$ can grow as log SNR with the
decoding error probability not approaching 1. More precisely, letting the rate $R(\mathrm{SNR})$, codeword length $n(\mathrm{SNR})$ and
decoding error probability $P_e(\mathrm{SNR})$ all depend on SNR, we wish to determine conditions on $n(\mathrm{SNR})$ to guarantee
the existence of a sequence of codebooks (indexed by SNR) with rate $R(\mathrm{SNR})$ and codeword length $n(\mathrm{SNR})$ such
that

$$\liminf_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} \ge r$$

and

$$\limsup_{\mathrm{SNR} \to \infty} P_e(\mathrm{SNR}) \le P_e$$

where $r > 0$ and $P_e \in (0, 1)$.

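Now we proceed to derive a necessary condition on the growth rate of $n(\mathrm{SNR})$. It follows by the chain rule that

$$I\big(x^{n(\mathrm{SNR})}; y^{n(\mathrm{SNR})}\big) = \sum_{k=1}^{n(\mathrm{SNR})} I\big(x^{n(\mathrm{SNR})}; y_k \,\big|\, y^{k-1}\big).$$

By (6), we can upper-bound $I\big(x^{n(\mathrm{SNR})}; y_k \,|\, y^{k-1}\big)$ as

$$\begin{aligned} I\big(x^{n(\mathrm{SNR})}; y_k \,\big|\, y^{k-1}\big) &\le \sup_{P_{x_k} \in \mathcal{P}_1(\mathrm{SNR})} I\Big(x_k, \mathbb{E}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big); y_k\Big) \\ &\le \sup_{P_{x_k} \in \mathcal{P}_1(\mathrm{SNR})} I(x_k; y_k) + \sup_{P_{x_k} \in \mathcal{P}_1(\mathrm{SNR})} I\Big(\mathbb{E}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big); y_k \,\Big|\, x_k\Big). \end{aligned}$$

Since $\sup_{P_{x_k} \in \mathcal{P}_1(\mathrm{SNR})} I(x_k; y_k) = o(\log \mathrm{SNR})$, and

$$\begin{aligned} I\Big(\mathbb{E}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big); y_k \,\Big|\, x_k\Big) &= \mathbb{E}\left\{\log \frac{1 + |x_k|^2}{1 + |x_k|^2 \cdot \mathrm{var}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big)}\right\} \\ &\le \log \frac{1 + \mathrm{SNR}}{1 + \mathrm{SNR} \cdot \mathrm{var}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big)} \\ &\le \log\Big(\frac{1 + \mathrm{SNR}}{\mathrm{SNR}}\Big) - \log \mathrm{var}\Big(h_k \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=1}^{k-1}\Big) \\ &\le \log\Big(\frac{1 + \mathrm{SNR}}{\mathrm{SNR}}\Big) - \log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_u + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_u\Big\}_{u=-n(\mathrm{SNR})}^{-1}\Big), \end{aligned}$$

it follows by Fano's inequality and the condition $\limsup_{\mathrm{SNR} \to \infty} P_e(\mathrm{SNR}) \le P_e$ that

$$\begin{aligned} \liminf_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} &\le \liminf_{\mathrm{SNR} \to \infty} \frac{I\big(x^{n(\mathrm{SNR})}; y^{n(\mathrm{SNR})}\big) + 1}{n(\mathrm{SNR})(1 - P_e(\mathrm{SNR})) \log \mathrm{SNR}} \\ &\le \liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{(1 - P_e(\mathrm{SNR})) \log \mathrm{SNR}} \\ &\le \liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{(1 - P_e) \log \mathrm{SNR}}. \end{aligned}$$

Therefore, in order for

$$\liminf_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} \ge r,$$

we must have

$$\liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{\log \mathrm{SNR}} \ge r(1 - P_e). \tag{17}$$

Since $-\log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n}^{-1}\Big) / \log \mathrm{SNR}$ is a monotone increasing function of $n$, it is easy to see that
(17) implicitly provides us with a lower bound on the scaling rate of $n(\mathrm{SNR})$.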

In order to derive an explicit lower bound on the scaling rate of $n(\mathrm{SNR})$, we need to introduce a concept called
transfinite diameter [12].

Definition 1: Let $S$ be a compact set in the plane. Set

$$V(z_1, \cdots, z_n) = \prod_{j < k} (z_j - z_k), \quad n \ge 2, \; z_i \in S,$$

$$V_n(S) = \max_{z_1, \cdots, z_n \in S} |V(z_1, \cdots, z_n)|$$

and

$$\tau_n(S) = [V_n(S)]^{\frac{2}{n(n-1)}}.$$

The transfinite diameter of $S$ is defined by

$$\tau(S) = \lim_{n \to \infty} \tau_n(S).$$

We need the following facts regarding the transfinite diameter.

(i) For two compact sets $S_1$ and $S_2$ with $S_1 \subseteq S_2$, we have $\tau(S_1) \le \tau(S_2)$.

(ii) The transfinite diameter of the unit circle is 1. More generally, the transfinite diameter of an arc of central angle $\theta$ on the unit
circle is $\sin(\theta/4)$.

(iii) The transfinite diameter of any closed proper subset of the unit circle is less than 1.
A full discussion of the transfinite diameter can be found in [12].
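Definition 1 and Fact (ii) can be checked numerically for the full unit circle. The sketch below assumes the standard fact that equally spaced points maximize the product of pairwise distances on the circle, in which case the product equals $n^{n/2}$ and $\tau_n = n^{1/(n-1)}$, which tends to 1. The function name is hypothetical.

```python
import cmath
import math

def tau_n_roots_of_unity(n):
    """Evaluate [prod_{j<k} |z_j - z_k|]^(2/(n(n-1))) at the n-th roots of unity.

    Assumption: equally spaced points are the maximizers in Definition 1 when
    S is the unit circle, so this value equals tau_n(S).
    """
    pts = [cmath.exp(2j * math.pi * k / n) for k in range(n)]
    # Sum logs instead of multiplying to avoid overflow for large n
    log_prod = sum(math.log(abs(pts[j] - pts[k]))
                   for j in range(n) for k in range(j + 1, n))
    return math.exp(2.0 * log_prod / (n * (n - 1)))

if __name__ == "__main__":
    for n in (5, 50, 400):
        print(n, tau_n_roots_of_unity(n), n ** (1.0 / (n - 1)))
```

The two printed columns should agree, and both approach 1 as $n$ grows, consistent with the unit circle having transfinite diameter 1.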
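Now return to the original problem. Since

$$\mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n(\mathrm{SNR})}^{-1}\Big) \ge \mathrm{var}\Big(h_0 \,\Big|\, \{h_v\}_{v=-n(\mathrm{SNR})}^{-1}\Big),$$

we have

$$\liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \Big\{h_v + \tfrac{1}{\sqrt{\mathrm{SNR}}} z_v\Big\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{\log \mathrm{SNR}} \le \liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \{h_v\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{\log \mathrm{SNR}}.$$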



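Let $S = \{e^{j\omega} : s(e^{j\omega}) > 0\}$. It was shown in [13] that if the set $S$ consists of a finite number of arcs of
the unit circle, then

$$\lim_{n \to \infty} \Big[\mathrm{var}\Big(h_0 \,\Big|\, \{h_v\}_{v=-n}^{-1}\Big)\Big]^{\frac{1}{n}} = \tau(S).$$

Under the conditions

(a) The set $S$ consists of a finite number of arcs of the unit circle,

(b) The set $S$ is a closed proper subset of the unit circle,

it can be shown by using Facts (i), (ii) and (iii) that

$$0 < \tau(S) < 1.$$

Therefore, under Conditions (a) and (b), we have

$$\begin{aligned} \liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \{h_v\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{\log \mathrm{SNR}} &= \liminf_{\mathrm{SNR} \to \infty} \frac{-\log \mathrm{var}\Big(h_0 \,\Big|\, \{h_v\}_{v=-n(\mathrm{SNR})}^{-1}\Big)}{n(\mathrm{SNR})} \cdot \frac{n(\mathrm{SNR})}{\log \mathrm{SNR}} \\ &= \liminf_{\mathrm{SNR} \to \infty} \frac{-n(\mathrm{SNR}) \log \tau(S)}{\log \mathrm{SNR}}. \end{aligned} \tag{18}$$

It is clear that in order to guarantee that (18) is greater than or equal to $r(1 - P_e)$, we must have

$$\liminf_{\mathrm{SNR} \to \infty} \frac{n(\mathrm{SNR})}{\log \mathrm{SNR}} \ge -\frac{r(1 - P_e)}{\log \tau(S)}$$

which is a necessary condition on the scaling rate of $n(\mathrm{SNR})$.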

In contrast, we show in Appendix VII and Appendix VIII that for the AWGN channel and the memoryless coherent
Rayleigh fading channel, it is possible to have the rate $R(\mathrm{SNR})$ grow linearly with log SNR with fixed codeword
length $n$ and bounded decoding error probability at high SNR. For these two cases, to facilitate the calculation, we
adopt the average power constraint. But our main conclusion also holds under the peak power constraint.



V. Capacity Per Unit Energy 

In the preceding sections, we focused on the channel capacity in the high SNR regime. Now we proceed to
characterize the behavior of channel capacity in the low average power regime for the block-stationary Gaussian
fading channel model. To this end, we shall study the capacity per unit energy (which is denoted by $C_p(\mathrm{SNR})$)
due to its intrinsic connection with the channel capacity in this regime. The following theorem provides a general
expression for the capacity per unit energy.
Theorem 3 ([9], [14]):

$$C_p(\mathrm{SNR}) = \lim_{n \to \infty} \sup_{x^n \in \mathcal{D}_n(\mathrm{SNR})} \frac{D\big(p_{y^n | x^n} \,\|\, p_{y^n | 0}\big)}{\|x^n\|^2}.$$

Furthermore, the capacity per unit energy is related to the capacity by

$$C_p(\mathrm{SNR}) = \sup_{P > 0} \frac{C(P, \mathrm{SNR})}{P} = \lim_{P \to 0} \frac{C(P, \mathrm{SNR})}{P}$$

where $C(P, \mathrm{SNR})$ is the channel capacity with average power constraint $P$ and peak power constraint SNR.

The following theorem is an extension of [14, Proposition 3.1] for the symbol-by-symbol stationary channel model
to the block-stationary model.

Theorem 4: For the block-stationary Gaussian fading channel model given in Q,

$$C_p(\mathrm{SNR}) = 1 - \frac{1}{2\pi \mathrm{SNR}} \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \Psi(\mathcal{M}, \mathrm{SNR})$$

where

$$\Psi(\mathcal{M}, \mathrm{SNR}) = \frac{1}{|\mathcal{M}|} \int_{-\pi}^{\pi} \log\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})\big] \, d\omega,$$

and $S_{\mathcal{M}}(e^{j\omega})$ is an $|\mathcal{M}| \times |\mathcal{M}|$ principal minor of $S(e^{j\omega})$ with the indices of columns and rows specified by $\mathcal{M}$.

Proof: The proof is omitted since it is almost identical to that for the symbol-by-symbol stationary fading
channel [14]. The only difference is that although the capacity per unit energy of the block-stationary fading
channel can be asymptotically achieved by temporal ON-OFF signaling, we have to determine how to allocate ON
symbols in a block. It can be shown that the optimal allocation scheme is given by $\mathcal{M}^*$, which is the minimizer
of $\min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \Psi(\mathcal{M}, \mathrm{SNR})$. Here $\mathcal{M}^*$ might not be unique. ■

$C_p(\mathrm{SNR})$ is a monotonically increasing function of SNR. It is easy to see that $C_p(\mathrm{SNR})$ goes to 1 as $\mathrm{SNR} \to \infty$,
and goes to 0 as $\mathrm{SNR} \to 0$. The following result provides a more precise characterization of the convergence
behavior.



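Corollary 1: At high SNR,

$$C_p(\mathrm{SNR}) = 1 - \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{\sum_{i=0}^{|\mathcal{M}|} i \, \mu\big(\mathrm{rank}(S_{\mathcal{M}}(e^{j\omega})) = i\big) \log \mathrm{SNR}}{2\pi |\mathcal{M}| \mathrm{SNR}} + o\Big(\frac{\log \mathrm{SNR}}{\mathrm{SNR}}\Big). \tag{19}$$

At low SNR, if

$$\int_{-\pi}^{\pi} \mathrm{tr}\big[S^2(e^{j\omega})\big] \, d\omega < \infty,$$

then

$$C_p(\mathrm{SNR}) = \frac{\mathrm{SNR}}{4\pi} \max_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{1}{|\mathcal{M}|} \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \, d\omega + o(\mathrm{SNR}). \tag{20}$$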

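Proof: By Lemma 1, at high SNR

$$\int_{-\pi}^{\pi} \log\det\Big[\frac{1}{\mathrm{SNR}} I_{|\mathcal{M}|} + S_{\mathcal{M}}(e^{j\omega})\Big] d\omega = -\sum_{i=0}^{|\mathcal{M}|} (|\mathcal{M}| - i) \, \mu\big(\mathrm{rank}(S_{\mathcal{M}}(e^{j\omega})) = i\big) \log \mathrm{SNR} + o(\log \mathrm{SNR}).$$

Therefore,

$$\begin{aligned} C_p(\mathrm{SNR}) &= 1 - \frac{1}{2\pi \mathrm{SNR}} \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \Psi(\mathcal{M}, \mathrm{SNR}) \\ &= 1 - \frac{1}{2\pi \mathrm{SNR}} \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \Big\{ \frac{1}{|\mathcal{M}|} \int_{-\pi}^{\pi} \log\det\Big[\frac{1}{\mathrm{SNR}} I_{|\mathcal{M}|} + S_{\mathcal{M}}(e^{j\omega})\Big] d\omega + 2\pi \log \mathrm{SNR} \Big\} \\ &= 1 - \frac{\log \mathrm{SNR}}{\mathrm{SNR}} + \max_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{\sum_{i=0}^{|\mathcal{M}|} (|\mathcal{M}| - i) \, \mu\big(\mathrm{rank}(S_{\mathcal{M}}(e^{j\omega})) = i\big) \log \mathrm{SNR}}{2\pi |\mathcal{M}| \mathrm{SNR}} + o\Big(\frac{\log \mathrm{SNR}}{\mathrm{SNR}}\Big) \\ &= 1 - \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{\sum_{i=0}^{|\mathcal{M}|} i \, \mu\big(\mathrm{rank}(S_{\mathcal{M}}(e^{j\omega})) = i\big) \log \mathrm{SNR}}{2\pi |\mathcal{M}| \mathrm{SNR}} + o\Big(\frac{\log \mathrm{SNR}}{\mathrm{SNR}}\Big). \end{aligned}$$

At low SNR, using the second-order approximation [15], we obtain

$$\log\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})\big] = \mathrm{tr}\big[S_{\mathcal{M}}(e^{j\omega})\big] \mathrm{SNR} - \frac{1}{2} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \mathrm{SNR}^2 + o(\mathrm{SNR}^2).$$

Therefore,

$$\begin{aligned} C_p(\mathrm{SNR}) &= 1 - \frac{1}{2\pi \mathrm{SNR}} \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \Psi(\mathcal{M}, \mathrm{SNR}) \\ &= 1 - \frac{1}{2\pi \mathrm{SNR}} \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{1}{|\mathcal{M}|} \Big\{ \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}(e^{j\omega})\big] \mathrm{SNR} \, d\omega - \frac{1}{2} \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \mathrm{SNR}^2 \, d\omega \Big\} + o(\mathrm{SNR}) \\ &= \frac{\mathrm{SNR}}{4\pi} \max_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{1}{|\mathcal{M}|} \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \, d\omega + o(\mathrm{SNR}) \end{aligned}$$

where the last equality follows from the fact that

$$\frac{1}{2\pi |\mathcal{M}|} \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}(e^{j\omega})\big] \, d\omega = 1.$$

Remark 6: Using the inequality

$$\log\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})\big] \ge \mathrm{tr}\big[S_{\mathcal{M}}(e^{j\omega})\big] \mathrm{SNR} - \frac{1}{2} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \mathrm{SNR}^2,$$

we can upper bound $C_p(\mathrm{SNR})$ by

$$C_p(\mathrm{SNR}) \le \frac{\mathrm{SNR}}{4\pi} \max_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{1}{|\mathcal{M}|} \int_{-\pi}^{\pi} \mathrm{tr}\big[S_{\mathcal{M}}^2(e^{j\omega})\big] \, d\omega.$$

It can be seen from Corollary 1 that this upper bound is a good approximation of $C_p(\mathrm{SNR})$ in the low SNR regime.
Now we proceed to compute $C_p(\mathrm{SNR})$ in the following examples.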

Example 5: When the channel changes independently from block to block, $C_p(\mathrm{SNR})$ is equal to

$$1 - \min_{\mathcal{M} \subseteq \{1, \cdots, T\}} \frac{1}{|\mathcal{M}| \mathrm{SNR}} \log\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, R_{\mathcal{M}}(0)\big]$$

where $R_{\mathcal{M}}(0)$ is an $|\mathcal{M}| \times |\mathcal{M}|$ principal minor of $R(0)$ with the indices of columns and rows specified by $\mathcal{M}$. If
we further let the fading remain constant within a block, then all the entries of $R(0)$ are one. It is not difficult to
show that

$$\frac{1}{|\mathcal{M}| \mathrm{SNR}} \log\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, R_{\mathcal{M}}(0)\big] = \frac{\log(1 + |\mathcal{M}| \mathrm{SNR})}{|\mathcal{M}| \mathrm{SNR}}$$

which is minimized when $|\mathcal{M}| = T$, i.e., $\mathcal{M} = \{1, 2, \cdots, T\}$. So we have

$$C_p(\mathrm{SNR}) = 1 - \frac{\log(1 + T\,\mathrm{SNR})}{T\,\mathrm{SNR}}$$

as shown in [14].
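The closed form of Example 5 is easy to evaluate, and it also gives a concrete instance of the low-SNR bound in Remark 6. The sketch below assumes the block-constant, block-independent model of this example, for which $\mathrm{tr}[R_{\mathcal{M}}^2(0)] = |\mathcal{M}|^2$ and the Remark 6 bound reduces to $T\,\mathrm{SNR}/2$. The function names are hypothetical.

```python
import math

def cp_block_constant(snr, T):
    """Capacity per unit energy of Example 5: 1 - log(1 + T*SNR) / (T*SNR)."""
    x = T * snr
    return 1.0 - math.log1p(x) / x

def low_snr_upper_bound(snr, T):
    """Remark 6 bound specialized to the all-ones R(0): (SNR/(4*pi)) * 2*pi*T."""
    return T * snr / 2.0

if __name__ == "__main__":
    for snr in (1e-3, 1e-2, 1e-1, 1.0):
        exact = cp_block_constant(snr, T=4)
        bound = low_snr_upper_bound(snr, T=4)
        print(f"SNR = {snr:g}: C_p = {exact:.6f}, bound = {bound:.6f}")
```

The exact value depends on $T$ and SNR only through their product, and the bound becomes tight as $T\,\mathrm{SNR}$ shrinks.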

Example 6: Consider the case in which the fading process satisfies the following conditions

1) All the off-diagonal entries of $R(0)$ are equal to $\alpha$, where $\alpha \in \mathbb{C}$ is a constant;

2) All the entries of $R(i)$ are equal to $\beta_i$ for any non-zero integer $i$, where $\beta_i \in \mathbb{C}$ is a constant that depends
only on $i$.

We also know that the diagonal entries of $R(0)$ are all one. So for any fixed $\omega$ ($-\pi \le \omega \le \pi$), all the diagonal
entries of $I + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})$ are identical, and all the off-diagonal entries of $I + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})$ are identical. It then
follows from Szász's inequality [16] that for any $\omega \in [-\pi, \pi]$,

$$\big\{\det\big[I_{|\mathcal{M}|} + \mathrm{SNR}\, S_{\mathcal{M}}(e^{j\omega})\big]\big\}^{\frac{1}{|\mathcal{M}|}}$$

is minimized when $\mathcal{M} = \{1, 2, \cdots, T\}$. In this case we therefore have

$$C_p(\mathrm{SNR}) = 1 - \frac{1}{2\pi T\,\mathrm{SNR}} \int_{-\pi}^{\pi} \log\det\big(I + \mathrm{SNR}\, S(e^{j\omega})\big) \, d\omega.$$

If the fading remains constant within a block, then for any fixed $\omega$, all the entries of $S(e^{j\omega})$ are identical, which
we shall denote by $s(e^{j\omega})$. It can be shown that

$$\det\big[I + \mathrm{SNR}\, S(e^{j\omega})\big] = 1 + T\,\mathrm{SNR}\, s(e^{j\omega}),$$

which yields

$$C_p(\mathrm{SNR}) = 1 - \frac{1}{2\pi T\,\mathrm{SNR}} \int_{-\pi}^{\pi} \log\big[1 + T\,\mathrm{SNR}\, s(e^{j\omega})\big] \, d\omega. \tag{21}$$

We can see from (21) that $C_p(\mathrm{SNR})$ is a monotonically increasing function of $T$ and SNR. Intuitively, as $T$ gets
larger and larger, the receiver can estimate the channel more and more accurately, and thus the capacity per unit
energy of the noncoherent channel should converge to that of the coherent channel, which is equal to one; as SNR
goes to infinity, $C_p(\mathrm{SNR})$ should also converge to one since flash signaling can be used if there is no peak power
constraint (i.e., $\mathrm{SNR} = \infty$) [17]. Moreover, (21) provides a precise characterization of the interplay between the
coherence time and signal peakiness, stating that the capacity per unit energy is unaffected as long as the product
of $T$ and SNR is fixed. See [18], [19] for a related discussion.

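For the special case where the fading is a block Gauss-Markov process, i.e., all the entries of $R(i)$ are equal to
$\rho^i$ if $i > 0$ and equal to $(\rho^*)^{-i}$ if $i < 0$ for some $\rho \in \mathbb{C}$ with $0 < |\rho| < 1$, we have

$$1 + T\,\mathrm{SNR}\, s(e^{j\omega}) = |\varphi(e^{j\omega})|$$

where

$$\varphi(z) = \frac{(\rho^* z - \gamma_0)^2}{\gamma_0 |\rho|^2 \big(z - \frac{1}{\rho^*}\big)^2}$$

and

$$\gamma_0 = \frac{b + \sqrt{b^2 - 4|\rho|^2}}{2}$$

with $b = 1 + T\,\mathrm{SNR} + |\rho|^2 (1 - T\,\mathrm{SNR})$. The function $\varphi$ is analytic and nonzero in a neighborhood of the unit disk.
Thus, by Jensen's formula

$$\begin{aligned} C_p(\mathrm{SNR}) &= 1 - \frac{1}{2\pi T\,\mathrm{SNR}} \int_{-\pi}^{\pi} \log|\varphi(e^{j\omega})| \, d\omega \\ &= 1 - \frac{1}{T\,\mathrm{SNR}} \log|\varphi(0)| \\ &= 1 - \frac{1}{T\,\mathrm{SNR}} \log \gamma_0, \end{aligned}$$

from which we can recover [14, Corollary 4.1] by setting $T = 1$.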
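The Jensen's-formula result can be sanity-checked by integrating (21) numerically. The sketch below assumes the Gauss-Markov spectrum $s(e^{j\omega}) = (1 - |\rho|^2)/|1 - \rho e^{-j\omega}|^2$ and uses a midpoint rule on a uniform grid; the function names and the grid size are illustrative assumptions.

```python
import cmath
import math

def gamma0(rho, t_snr):
    """gamma_0 = (b + sqrt(b^2 - 4|rho|^2)) / 2, b = 1 + T*SNR + |rho|^2 (1 - T*SNR)."""
    p2 = abs(rho) ** 2
    b = 1.0 + t_snr + p2 * (1.0 - t_snr)
    return (b + math.sqrt(b * b - 4.0 * p2)) / 2.0

def cp_closed_form(rho, T, snr):
    """Closed form 1 - log(gamma_0) / (T*SNR)."""
    return 1.0 - math.log(gamma0(rho, T * snr)) / (T * snr)

def cp_numeric(rho, T, snr, grid=4096):
    """Evaluate (21) by a midpoint rule over [-pi, pi]."""
    p2 = abs(rho) ** 2
    total = 0.0
    for k in range(grid):
        w = -math.pi + 2.0 * math.pi * (k + 0.5) / grid
        s = (1.0 - p2) / abs(1.0 - rho * cmath.exp(-1j * w)) ** 2
        total += math.log(1.0 + T * snr * s)
    mean_log = total / grid  # approximates (1/(2*pi)) * integral
    return 1.0 - mean_log / (T * snr)

if __name__ == "__main__":
    rho = 0.9 * cmath.exp(0.3j)
    print(cp_closed_form(rho, T=3, snr=2.0), cp_numeric(rho, T=3, snr=2.0))
```

As $|\rho| \to 0$ the closed form reduces to $1 - \log(1 + T\,\mathrm{SNR})/(T\,\mathrm{SNR})$, the block-independent expression of Example 5.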

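Finding the optimal $\mathcal{M}^*$ is a difficult problem in general. Moreover, as shown in the following example, the
optimal $\mathcal{M}^*$ may depend on the SNR level.

Example 7: Let the fading process be independent from block to block with

$$S(e^{j\omega}) = R(0) = \begin{pmatrix} 1 & 1 & \rho \\ 1 & 1 & \rho \\ \rho^* & \rho^* & 1 \end{pmatrix}$$

where $|\rho| \in [0, 1]$.

It is shown in Appendix IX that

1) When $0 \le |\rho| \le \frac{1}{2}$, the optimal $\mathcal{M}^*$ is $\{1, 2\}$, and

$$C_p(\mathrm{SNR}) = 1 - \frac{1}{2\pi \mathrm{SNR}} \Psi(\{1, 2\}, \mathrm{SNR}) = 1 - \frac{\log(1 + 2\,\mathrm{SNR})}{2\,\mathrm{SNR}}.$$

2) When $\frac{1}{2} < |\rho| < 1$,

$$\mathcal{M}^* = \begin{cases} \{1, 2, 3\}, & \mathrm{SNR} < \frac{2|\rho| - 1}{2(1 - |\rho|)^2} \\ \{1, 2\} \text{ or } \{1, 2, 3\}, & \mathrm{SNR} = \frac{2|\rho| - 1}{2(1 - |\rho|)^2} \\ \{1, 2\}, & \mathrm{SNR} > \frac{2|\rho| - 1}{2(1 - |\rho|)^2} \end{cases}$$

and

$$C_p(\mathrm{SNR}) = \begin{cases} 1 - \dfrac{\log(1 + 3\,\mathrm{SNR} + 2\,\mathrm{SNR}^2 - 2|\rho|^2 \mathrm{SNR}^2)}{3\,\mathrm{SNR}}, & \mathrm{SNR} \le \frac{2|\rho| - 1}{2(1 - |\rho|)^2} \\ 1 - \dfrac{\log(1 + 2\,\mathrm{SNR})}{2\,\mathrm{SNR}}, & \mathrm{SNR} > \frac{2|\rho| - 1}{2(1 - |\rho|)^2}. \end{cases}$$

3) When $|\rho| = 1$, the optimal $\mathcal{M}^*$ is $\{1, 2, 3\}$, and

$$C_p(\mathrm{SNR}) = 1 - \frac{1}{2\pi \mathrm{SNR}} \Psi(\{1, 2, 3\}, \mathrm{SNR}) = 1 - \frac{\log(1 + 3\,\mathrm{SNR})}{3\,\mathrm{SNR}}.$$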
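The case analysis of Example 7 can be cross-checked by brute force over all nonempty subsets of $\{1, 2, 3\}$. The sketch below is only an illustration: it assumes a real-valued $\rho$ (the results depend on $\rho$ only through $|\rho|$), evaluates $\Psi(\mathcal{M}, \mathrm{SNR})/(2\pi) = \frac{1}{|\mathcal{M}|} \log\det[I + \mathrm{SNR}\, R_{\mathcal{M}}(0)]$ for every subset, and uses hypothetical function names.

```python
import itertools
import math

def det(m):
    """Determinant by cofactor expansion (adequate for the tiny matrices here)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for c in range(len(m)):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * m[0][c] * det(minor)
    return total

def best_subset(rho, snr):
    """Return (M*, C_p) for the 3x3 R(0) of Example 7, with real rho assumed."""
    r0 = [[1.0, 1.0, rho], [1.0, 1.0, rho], [rho, rho, 1.0]]
    best = None
    for size in (1, 2, 3):
        for idx in itertools.combinations(range(3), size):
            m = [[(1.0 if i == j else 0.0) + snr * r0[i][j] for j in idx]
                 for i in idx]
            cost = math.log(det(m)) / size  # Psi(M, SNR) / (2*pi)
            if best is None or cost < best[0] - 1e-12:
                best = (cost, idx)
    cost, idx = best
    return tuple(i + 1 for i in idx), 1.0 - cost / snr

def threshold(rho):
    """Switching point (2|rho| - 1) / (2 (1 - |rho|)^2) from case 2)."""
    return (2.0 * abs(rho) - 1.0) / (2.0 * (1.0 - abs(rho)) ** 2)

if __name__ == "__main__":
    for snr in (1.0, 20.0):
        print(snr, best_subset(0.8, snr), "threshold:", threshold(0.8))
```

For $\rho = 0.8$ the switching point is 7.5, so the search should select $\{1, 2, 3\}$ below it and $\{1, 2\}$ above it.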
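It can be verified that this result is consistent with the asymptotic analysis in Corollary 1. Since $S_{\mathcal{M}}(e^{j\omega}) = R_{\mathcal{M}}(0)$
for any $\mathcal{M} \subseteq \{1, \cdots, T\}$, it is easy to see that

$$\sum_{i=0}^{|\mathcal{M}|} \frac{i}{|\mathcal{M}|} \, \mu\big(\mathrm{rank}(S_{\mathcal{M}}(e^{j\omega})) = i\big)$$

is minimized at $\mathcal{M} = \{1, 2\}$ if $|\rho| < 1$, and minimized at $\mathcal{M} = \{1, 2, 3\}$ if $|\rho| = 1$. Therefore, by (19), the optimal
$\mathcal{M}^*$ at high SNR should be $\{1, 2\}$ if $|\rho| < 1$, and should be $\{1, 2, 3\}$ if $|\rho| = 1$. Since

$$S^2_{\{1,2\}}(e^{j\omega}) = \begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix}, \quad S^2_{\{1,3\}}(e^{j\omega}) = S^2_{\{2,3\}}(e^{j\omega}) = \begin{pmatrix} 1 + |\rho|^2 & 2\rho \\ 2\rho^* & 1 + |\rho|^2 \end{pmatrix},$$

$$S^2_{\{1,2,3\}}(e^{j\omega}) = \begin{pmatrix} 2 + |\rho|^2 & 2 + |\rho|^2 & 3\rho \\ 2 + |\rho|^2 & 2 + |\rho|^2 & 3\rho \\ 3\rho^* & 3\rho^* & 1 + 2|\rho|^2 \end{pmatrix},$$

it follows that $\frac{1}{|\mathcal{M}|} \mathrm{tr}\big[S^2_{\mathcal{M}}(e^{j\omega})\big]$ is maximized at $\mathcal{M} = \{1, 2\}$ if $|\rho| < \frac{1}{2}$, and maximized at $\mathcal{M} = \{1, 2, 3\}$ if
$|\rho| > \frac{1}{2}$. Therefore, by (20), the optimal $\mathcal{M}^*$ at low SNR should be $\{1, 2\}$ if $|\rho| < \frac{1}{2}$, and should be $\{1, 2, 3\}$ if
$|\rho| > \frac{1}{2}$.

Intuitively, if $|\rho|$ is close to 1, we can approximate $R(0)$ by the all-one matrix, and then it follows from Example 5
that the optimal $\mathcal{M}^*$ is $\{1, 2, 3\}$. The approximation breaks down at high SNR since Corollary 1 implies that the
optimal $\mathcal{M}^*$ should be $\{1, 2\}$ as $\mathrm{SNR} \to \infty$.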




VI. Conclusion 

We conducted a detailed study of the block-stationary Gaussian fading channel model introduced in [7]. We 
derived single-letter upper and lower bounds on channel capacity, and used these bounds to characterize the 
asymptotic behavior of channel capacity. Specifically, we computed the asymptotic ratio between the non-coherent 
channel capacity and the logarithm of the SNR in the high SNR regime. This result generalizes many previous 
results on noncoherent capacity. We showed that the behavior of channel capacity depends critically on channel 
modelling. We also derived an expression for the capacity per unit energy for the block-stationary fading model. It is 
clearly of interest to generalize these results to the multi-antenna scenario, but such an extension seems technically 
nontrivial. 

Another direction that we explored was the interplay between the codeword length, SNR level, and decoding 
error probability. We showed that for noncoherent symbol-by-symbol stationary fading channels, the codeword 
length must scale with SNR in order to guarantee that the communication rate can grow linearly with log SNR with 
bounded decoding error probability, and we found a necessary condition for the growth rate of the codeword length. 
We believe that a more complete characterization of the interplay between the codeword length, SNR level, and
decoding error probability is of both theoretical significance and practical value. 



Appendix I 
Proof of Monotonicity 



By the block stationarity of the fading process, we have 



I \^r, . ,,: y, . E I h 
Since for any j\ < j 2 , 



i+jT 



hi 



1 



-Zk 



-mm J k=l 



i+jT-1' 



h k 



1 



-z k 



hi 



-z k 



(22) 



fe=l-jT 



k=l-j 2 T J J \ 

form a Markov chain, it follows by data processing inequality that 

i-l \ \ / 



h k H z k 

•^miii 



h; 



hi 



1 



-Zk 



fc=i-iiT / 



h, 



i-l 



fe=i-iiT / 



i-l 



(23) 



Equations i22i and d23i together imply that ^ 7 | aJj+jT ; Ui+jT 
monotone increasing sequence 

For every E ^h 
of everything else such that 

1 



E /(,.// 



k=l-j 2 T / 
i+jT-1 



{ h * + —Mk^ 



+ ^-2fc| ^ ). v\e can construct a random variable A, -~ CA tn.rf, ) independent 



E //., 



h h 



Zk 

min ) k=\-jT 



E // 



E hi 



hi 



-z k 



L, mm ) k— — 



in distribution. Clearly, Sj — > as j > — > oo. Moreover, it is not difficult to show that 



E 



1 



Combining J22i and J24l >. we get 



-Zk 



fc=l-jT 



E hi 



lim I x i+jT ;y i+jT 



h 



i+jT 



-z k 



-z k 



i+jT-1 ' 



(24) 



k=— oo 



lim 7 Xi\ y t 

j-too \ 



= lim 7 tc^yi 



E hi 



E fe< 



1 



i 



-z k 



X 



-Zk 



mm J fc= _ 



i-min J k=l 



k=l-jT , 
i-l \ 



Since 



E hi 



7 yx l ;y 
= I (x^y^Elh 
= hly^Elh 



h k 



hi 



-Zk 



-Zk 



k— — oo 
i-l s 



A., 



'■'mm J k=-c 
i-l \ 



hi 



-z k 



hi y l .E\h l 



k= — oo 



h k H -z k 

in 



k— — oo 
By [3, Lemma 6.11], we get



lim h I yj,E I hi 

j->oo \ \ 



hi 



-Zk 



b min ) k 



1 



h k H z k 

x 



mm ) k—— 



Since conditioned on Xi, [y il E I h 



fe=-c 
12 



h^Elhi 
- Aj I are jointly Gaussian with uniformly bounded 



differential entropy for any realization of cc; (Note: |a;i| 2 < SNR), it follows by dominated convergence theorem 
that 



lim h y it E h t 
Therefore, 



lim / x i+jT ;y i+j T 



h k H — z k 

x 



mm ) k— — 



x i\ = M y^Elhi 



h k H — Zk 

x 



mm ) k =- 



E h i+ jT 



hi 



-Zk 



ZD 

Appendix II 



lim / X f, y t 



hi 



hi 



-z k 



k=—oo 



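Proof of Lemma 1

By eigenvalue decomposition, we write

$$A(\xi) = U(\xi) D(\xi) U^\dagger(\xi)$$

and

$$A(\xi) + \epsilon I_M = U(\xi)\big(D(\xi) + \epsilon I_M\big) U^\dagger(\xi)$$

where $U(\xi)$ is a unitary matrix, and $D(\xi)$ is a diagonal matrix with nonnegative diagonal entries. Since $\mathrm{rank}(A(\xi)) = \mathrm{rank}(D(\xi))$, define

$$\Omega_i = \{\xi : \mathrm{rank}(A(\xi)) = \mathrm{rank}(D(\xi)) = i\}, \quad 0 \le i \le M.$$

We have

$$\begin{aligned} \lim_{\epsilon \to 0} \frac{\int \log\det[A(\xi) + \epsilon I_M] \, d\xi}{\log \epsilon} &= \lim_{\epsilon \to 0} \frac{\int \log\det[D(\xi) + \epsilon I_M] \, d\xi}{\log \epsilon} \\ &= \lim_{\epsilon \to 0} \frac{\sum_{i=0}^{M} \int_{\Omega_i} \log\det[D(\xi) + \epsilon I_M] \, d\xi}{\log \epsilon}. \end{aligned}$$

For $\xi \in \Omega_i$, (possibly after permuting diagonal entries) we can write $D(\xi) = \mathrm{diag}\{d_1(\xi), \cdots, d_i(\xi), 0, \cdots, 0\}$,
where $d_j(\xi) > 0$, $1 \le j \le i$. Therefore,

$$\lim_{\epsilon \to 0} \frac{\int_{\Omega_i} \log\det[D(\xi) + \epsilon I_M] \, d\xi}{\log \epsilon} = \lim_{\epsilon \to 0} \frac{\sum_{j=1}^{i} \int_{\Omega_i} \log(d_j(\xi) + \epsilon) \, d\xi}{\log \epsilon} + (M - i) \, \mu\big(\mathrm{rank}(A(\xi)) = i\big)$$

where $\mu(\mathrm{rank}(A(\xi)) = i)$ is the Lebesgue measure of $\Omega_i$. By the argument in [6, Section VIII], it can be shown
that

$$\lim_{\epsilon \to 0} \frac{\int_{\Omega_i} \log(d_j(\xi) + \epsilon) \, d\xi}{\log \epsilon} = 0.$$

So we have

$$\lim_{\epsilon \to 0} \frac{\int \log\det[A(\xi) + \epsilon I_M] \, d\xi}{\log \epsilon} = \sum_{i=0}^{M} (M - i) \, \mu\big(\mathrm{rank}(A(\xi)) = i\big).$$

Appendix III
Proof of Theorem 1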



Define 



CTj(SNR) =var h t 



VSNR 



VSNR 



Zk 



i-l 



k= — oo 



i= 1,2,.-. ,T. 



In the lower bound Q, let CEj be uniformly distributed over the set j z 6 



/SNR 



< z < 



VSNR}. By 



Lemma 



2 



> -log 

= -log 



a, m\ - 



(hi 


{ 


12 




5SNR 


12 




5SNR 



hfc + SNR Z * 



i-l 



k=-c 



log 



/SNR\ 4 \ , 5e 

1 _ CT i —r + ^H^TH _ 7 - log T 



V 4 / SNR y 



o(logSNR). 



Since ^ iff) > it follows that 



lim inf ■ 

SNR^co 



1 1 x^ : h^Xi -\- Zi 



E 



(hi\{hk + sMR^fc}L- 



> lim inf ■ 

log SNR ~ snr^oo log SNR 

Let S (^) = L (^) A (^) Lt (Snr^ where £(SNR) is a lower tr i angu i ar matr i x with unit diagonal entries, 



logh(^)] 



andA(^)=diag{ < 7 1 (^), C r 2 

. r /SNRV 
det J 



SNR^ 



,a T (^)}. We have 



= dct 
= det 

T 





( SNRV 




A 


'SNRV 





V 4 J 



V 4 



(25) 



Therefore, 



lim inf 



C(SNR) 



_ ; lim inf 1 

SNR^oo log SNR SNR^oo 



I (x l ;h l x l + Zi E(hi {h k + ^Zk} k J_ 
i=i v v 



T log SNR 



log 



> lim inf ■ 

SNR^oo 



lim inf • 

SNR^oo 

lim inf ■ 

SNR^oo 



ft * m 



log SNR 

log det 

log SNR 
log det [£ (SNR)] 

log SNR 



(26) 
We use (8) to derive an upper bound on $C(\mathrm{SNR})/\log \mathrm{SNR}$. First, it is easy to see that

1 



sup J CEfcjE h k 
< sup / ( E ( h k 



hi 



k-l 



h. 



fe-i 



sup I(x k ;y k ). (27) 

P* k ePi (SNR) 



It is shown in [3] that 



sup I(x k ;y k ) = o(logSNR). 

-FWeTMSNR) 



Now we proceed to upper-bound the first term in d27i . 



sup I 

P»,e-Pi(SNR) 



SUp M. ^ log 

Pm k eVi{SNR) 



k-l 



1 + lajfcl 



x k 



< lop 



log 



1+ \x k \ 2 • var ft, 



1 + SNR 



f 1 



1 + SNR • var {h k 

1 + SNR 
SNR ■ (Tfe(SNR) 



r 1 fc_1 



Therefore, 



£ sup J E h fc 
C(SNR) ,. k=i p mh evi(sm) \ \ 

lim sup — — < lim sup — — — 

SNR^i log SNR " snr^oS T log SNR 



k-l 



log 



< lim sup ■ 

SNR^oo 

= lim sup ■ 

SNR^oo 



I! °k (SNR) 

k=l 



log SNR 
logdet [S(SNR)] 



log SNR 

The desired result follows by combining i26i and d28l >. 



(28) 



Appendix IV
Proof of Theorem 2

Define

$$\sigma_i(\infty) = \mathrm{var}\Big(h_i \,\Big|\, \{h_k\}_{k=-\infty}^{i-1}\Big), \quad i = 1, 2, \cdots, T.$$

Similar to (25), we have

$$\det[\Sigma(\infty)] = \prod_{i=1}^{T} \sigma_i(\infty).$$

Therefore, $\det[\Sigma(\infty)] > 0$ implies $\sigma_i(\infty) > 0$ for all $i$.
Note that if $x_{\min} > \delta$ for some $\delta > 0$, then

1 



form a Markov chain, and 



hi 



-z k 



i-l 



y^EUn {h k + Sz k y k J_ 



hi 



hi 



1 



-z k 



k= — oo 



In the lower bound (|3J, let log | £Ki | 2 be uniformly distributed over the interval [log log SNR]. As log a; 
grows sublinearly in logSNR to infinity, we get 

1 



2 

min 



lim inf 

SNR^oo 



> lim inf 

SNR^oo 



I [ x^. h^Xi -\- z l 



I ( X i , hiXi -\- Zi 



hi 



h k 



•'mi 



-z k 



mm ) k =- 



i-1 



- log log SNR 



Eh 



{hk+SzkY-J^ ,)) - log log SNR 



— 1 — 7 — log var ( h 



{hj+Szj} 1 



j=-oo 



where the last equality follows from [3, Proposition 4.23]. Therefore, 



lim inf [C(SNR) - log log SNR] 

SNR^oo 



> lim inf 

SNR^oo 



E hi 



-z k 



i-l 



log log SNR 



T^Z 1 l X i>hiXi + z 

1 T 

= - 1 - 7 - j, lo § var ( hi { h i + 5z o }j=- 

i=l 

Since j29t holds for arbitrary positive 5, it follows that 

1 T 

liminf [C(SNR) - loglogSNR] > -1 - 7 - - lim^logvar (hi {hj + Szj}^ 



-1-7- -log det [£(00)]. 



From the upper bound (|8), we have 



limsup[C(SNR) - loglogSNR] 

SNR^co 



< lim sup 

SNR^oo 



< lim sup 

SNR^oo 



< lim sup 

SNR^oo 



sup I \Xk,E\hk \h t + —L=z t \ ] ;y k ]- loglogSNR 

1 T 

sup /(xfc.E^fcl^^ziJ-.VfcJ-bglogSNR 

-t t P_ CP, CqMR"! 



■ e=1 i^ePi(SNR) 



(29) 



(30) 



(31) 



_P= 0fc G'Pi(SNR) 

where (13 1 i follows from the fact that 



1 

sup 7(:E 1 ;y 1 ) + -V sup 7 (E (h k |{fcj}j=-oo) 5»k| **) ~ loglogSNR 
=-p, rqiMR'i J , c=1 p a , k ev 1 (SNR) 



Xk, E /i 



h t + 



k-l 



(s*,E(/»*|{fci}£ioo))->» 
form a Markov chain.

It was shown in [3, Corollary 4.19] that 



lim 

SNR^oo 



sup /(xi;^) - log log SNR 

P«. SPifSNR) 



= -1-7- 



Furthermore, 



/(E(^|{^tioc);y fc |^) < i( E ( h k\{ h i}j=-oo);hk,y k \x k ) 



= /(E^feK^}^);^) + l(®(h k \{h j }*Z l _ 00 );y k \h k ,x k ) 

= /(E^K^-^zioo);^) 

= -logo-fe(oo) 



(32) 



where d32l > follows from the fact that E (hkHhj}^!^ — > (h k ,x k ) — ► y fc form a Markov chain. Therefore, we 
get 

T 

r 



i T 

limsup[C(SNR) -loglogSNR] < -1 - 7 - - ^loga fc (0) 



SNR^oo 



fc=l 



-I-7 - -logdet [E(oo) 



(33) 



The desired result follows by combining f30t and d33t . 



At high SNR, we have 



log 



da; 



=SNR r 



Appendix V 
Proof of Proposition^ 



alog (4) + ^ - a - sm ^ log ( SNR ~ r + s4) 

+SNR l0g ( SNR 7 + SNR J 

-2ttk log SNR + c(r) + o(l) 



and thus 



where k 



var /ii 



a+min(r,l) (27T— a) 
2tt ' 



\/SNR 2:i j fe _ v 



and 



s s (ei"),6=SNR r 



SNR K ~ SNR +0 { SNR K 



(34) 



a log 4 re (0, 1) 

c(r) = ^ a log 4+ (27r-a)log5 r = l 

27rlog4 r € (l,oo). 
In the lower bound (14), let $x_1$ be uniformly distributed over the set $\{z \in \mathbb{C} : \frac{\sqrt{\mathrm{SNR}}}{2} \le |z| \le \sqrt{\mathrm{SNR}}\}$. By Lemma
2 and Equation (34), we get



C*(SNR) > I [xi;hixi + zi 



E hi 



hi 



> -log 



e 2» 4 



SNR K SNR 5SNR 
KlogSNR + o(log SNR). 



SNR h 



, / e^? 4 / 1 

log 1 - ^rrrr + Trrrrr - O f 



SNR SNR \SNRk 



1 - lo; 



5e 



Therefore, 



SNR^oo,e=SNR'' log SNR 

When r > 1, we have 



.. . . C(SN R )l. 9 («i») ^ a + min(r,l)(27r-a) 
lim ml — — ; > k = 



2tt 



SNR^oo,6»=SNR'- log SNR 

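Since the noncoherent channel capacity with peak power constraint $|x|^2 \le \mathrm{SNR}$ is upper bounded by the coherent
channel capacity with average power constraint $\mathbb{E}|x|^2 \le \mathrm{SNR}$, it follows that

$$C(\mathrm{SNR})|_{s_\theta(e^{j\omega})} \le \mathbb{E}_h \log(1 + \mathrm{SNR}|h|^2) \quad \text{for all } \theta,$$

where $h \sim \mathcal{CN}(0, 1)$. Therefore,

$$\limsup_{\mathrm{SNR} \to \infty, \, \theta = \mathrm{SNR}^r} \frac{C(\mathrm{SNR})|_{s_\theta(e^{j\omega})}}{\log \mathrm{SNR}} \le \lim_{\mathrm{SNR} \to \infty} \frac{\mathbb{E}_h \log(1 + \mathrm{SNR}|h|^2)}{\log \mathrm{SNR}} = 1.$$

The proof is complete.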



Appendix VI 
Example|4] 

In the lower bound dl4> . let Xi be uniformly distributed over the set {z £ 
13 we get 

2 



/SNR 



<\z\< VSNR}. By Lemma 



where 



C(SNR) > J I x^hxxx + zi 
> -log ^var Uii 
+ log ( 1 - var (^fai 

var hi 



hi 



hi 



/SNR Zl ^ : 



/ SNR 



fc— — oo 




5SNR 



hi 



/SNR 



7 - loj 



be 



hk H ^=zi 

VSNR j 



£i + snr) ( £2 + snr) 



1 — aiei — (Q!2 — ai)t2 4 
l-a 2 SNR 



4 
SNR 



(35) 
By specializing the upper bound (8) to the case where $T = 1$, we obtain



C(SNR) < 



sup / I xi,E I hi 



1 



VSNR 



< 



< 



sup I(xi;y 1 ) + sup / E I hi 

Pu> 1 eVi(SNR) ' P mi eVi(SNR) \ \ 

sup I(xi;y 1 ) + sup / E I hi 

Pk 1 £T'i(SNR) P a!1 eV 1 (SNR) V V 



h t 



hi 



/SNR 



Xi 



\hi,yi 



Xi 



sup I(xi;yi) + 1 E [hi 

p^evtism) y y 



where 



/SNR 



-.Zl 



, hi 



= -log 



= - log 



var hi 



t — — oo I 




h t 



/ SNR Z *,^ , 



;i + snr) G 2 + snr) 



1 - ai6i - (q 2 " Qi)e2 1 
1 - a 2 SNR 



H 1 — «2 



SNR 



(36) 



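Note that $\sup_{P_{x_1} \in \mathcal{P}_1(\mathrm{SNR})} I(x_1; y_1)$ is the capacity of the memoryless noncoherent Rayleigh fading channel (see
[3] for a nonasymptotic upper bound), and we have $\sup_{P_{x_1} \in \mathcal{P}_1(\mathrm{SNR})} I(x_1; y_1) \ll \log \mathrm{SNR}$ for large SNR.

By (35) and (36), it is not hard to verify that $C(\mathrm{SNR})/\log \mathrm{SNR}$ is approximately equal to $\alpha_2$ for $1 \ll \mathrm{SNR} \ll \frac{1}{\epsilon_2}$ and is
approximately equal to $\alpha_1$ for $\log \frac{1}{\epsilon_2} \ll \log \mathrm{SNR} \ll \log \frac{1}{\epsilon_1}$.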



Appendix VII
AWGN Channel

By the random coding bound [20], we have

$$P_e \le e^{-n E_r(R)}$$



where 



E r (R) 



SNR 



efl + l_(e«_i)Ji + 



if log 



1 I SNR i 1 



SNR 2 



SNR(e- R - 1 
<R< log(l + SNR), and 



log< e 



H SNR(e fl - 1) 



4e R 



5NR(e R - 1) 



SNR / SNR 2 , / 1 SNR 1 / SNR 2 

Er(R) = l + — -Jl + ^+log --_ + + — 



(37) 



, .1 SNR 1 / SNR 2 \ „ /oox 
l0g l2 + — + + ^ " fl(38) 



if R < log 



1 , SNR , 1 

2 ' 4 ' 2 



SNR 2 
Let $R(\mathrm{SNR}) = \log \mathrm{SNR} - \log \eta$ where $\eta \in (1, 2)$. By (37),



lim E r (R(SNR)) 

SNR^oo 



lim — 

SNR^oo 2 

lim — 

SNR^oo 2 



SNR / SNR \ / 4 

— + 1 H~ _1 JV 1+ SNR^ 

SNR SNR - ry / 2 
+ 1 L 1 + FTio 

77 77 V SNR - 77 



+ log | 



f SNR SNR(SNR - r?) 



2>] 



log | 



f SNR SNR(SNR - 77) 



2r? 



SNR -7/ 

2 2 



SNR - 77 (SNR - 7/) 2 



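$= \eta - 1 - \log \eta > 0.$

For any $P_e > 0$, we can find an $n$ such that

$$e^{-n(\eta - 1 - \log \eta)} \le P_e.$$

Therefore, for any $P_e > 0$, there exists a sequence of codebooks with rate $R(\mathrm{SNR})$ and fixed codeword length $n$
such that

$$\lim_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} = 1 \quad \text{and} \quad \limsup_{\mathrm{SNR} \to \infty} P_e(\mathrm{SNR}) \le P_e.$$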



Appendix VIII 
Coherent Rayleigh Fading Channel 



It was shown in [21] that 



E r (R) = max 

0<p<l 



1 + p 



pR 



where $h \sim \mathcal{CN}(0, 1)$.

Choosing $R(\mathrm{SNR}) = \log \mathrm{SNR} - \log\log \mathrm{SNR} - c$ and $\rho = 1$, we get

SNR 



liminf £y(i£(SNR)) > liminf 

SNR^oo SNR^oo 



lim inf 

SNR^oo 



logE h 1 



1 + P 



hi 



logSNR + loglog SNR + c 



logE h I — + ~\h\' ) + log log SNR + c 



lim inf < — log 
Snr—oo 



lim inf < — log 

SNR->oo 



> lim inf < — log 

SNR^oo 



(. 



1 t 



e-*di 



o V SNR 2 

o{m + l) lsNR + 2 1 e ~ tM 



log log SNR + cj 



log log SNR 



1 / 1 + \ 1 r°° 

- — + - dt+ 2e"*dt 
VSNR 2) A 



log log SNR + c 



= Urn inf [- log [2 log(2 + 2SNR + SNR 2 ) - 2 log(2 + 2SNR) + 2c" 1 ] + log log SNR + c] 



= - log 2 + c 
which is positive if $c > \log 2$.

Therefore, for any $P_e > 0$, we can find a sequence of codebooks with rate $R(\mathrm{SNR})$ and fixed codeword length
$n$ such that

$$\lim_{\mathrm{SNR} \to \infty} \frac{R(\mathrm{SNR})}{\log \mathrm{SNR}} = 1 \quad \text{and} \quad \limsup_{\mathrm{SNR} \to \infty} P_e(\mathrm{SNR}) \le P_e.$$

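Appendix IX
Example 7

We can compute that

$$\Psi(\{1\}, \mathrm{SNR}) = \Psi(\{2\}, \mathrm{SNR}) = \Psi(\{3\}, \mathrm{SNR}) = 2\pi \log(1 + \mathrm{SNR}),$$
$$\Psi(\{1, 3\}, \mathrm{SNR}) = \Psi(\{2, 3\}, \mathrm{SNR}) = \pi \log(1 + 2\,\mathrm{SNR} + \mathrm{SNR}^2 - |\rho|^2 \mathrm{SNR}^2),$$
$$\Psi(\{1, 2\}, \mathrm{SNR}) = \pi \log(1 + 2\,\mathrm{SNR}),$$
$$\Psi(\{1, 2, 3\}, \mathrm{SNR}) = \frac{2\pi}{3} \log(1 + 3\,\mathrm{SNR} + 2\,\mathrm{SNR}^2 - 2|\rho|^2 \mathrm{SNR}^2).$$

It can be verified that

$$\Psi(\{1\}, \mathrm{SNR}) = \Psi(\{2\}, \mathrm{SNR}) = \Psi(\{3\}, \mathrm{SNR}) \ge \Psi(\{1, 3\}, \mathrm{SNR}) = \Psi(\{2, 3\}, \mathrm{SNR}) \ge \Psi(\{1, 2\}, \mathrm{SNR}).$$

So the optimal $\mathcal{M}^*$ is either $\{1, 2\}$ or $\{1, 2, 3\}$. Setting $\Psi(\{1, 2\}, \mathrm{SNR}) = \Psi(\{1, 2, 3\}, \mathrm{SNR})$ yields

$$(1 + 3\,\mathrm{SNR} + 2\,\mathrm{SNR}^2 - 2|\rho|^2 \mathrm{SNR}^2)^2 = (1 + 2\,\mathrm{SNR})^3$$

which, after some algebraic manipulation, is equivalent to

$$1 - 4|\rho|^2 + (4 - 12|\rho|^2)\,\mathrm{SNR} + (4 - 8|\rho|^2 + 4|\rho|^4)\,\mathrm{SNR}^2 = 0.$$

The above equation has two solutions

$$\mathrm{SNR}_1 = -\frac{2|\rho| + 1}{2(1 + |\rho|)^2}, \quad \mathrm{SNR}_2 = \frac{2|\rho| - 1}{2(1 - |\rho|)^2}.$$

$\mathrm{SNR}_1$ can be discarded since it is always negative. $\mathrm{SNR}_2$ is positive for $|\rho| \in (\frac{1}{2}, 1)$. When $|\rho| \in (\frac{1}{2}, 1)$, it can
be verified that $\Psi(\{1, 2\}, \mathrm{SNR}) > \Psi(\{1, 2, 3\}, \mathrm{SNR})$ if $\mathrm{SNR} < \mathrm{SNR}_2$, and $\Psi(\{1, 2\}, \mathrm{SNR}) < \Psi(\{1, 2, 3\}, \mathrm{SNR})$ if
$\mathrm{SNR} > \mathrm{SNR}_2$. When $|\rho| \in [0, \frac{1}{2}]$, $\mathrm{SNR}_2$ is non-positive. In this case, we have $\Psi(\{1, 2\}, \mathrm{SNR}) < \Psi(\{1, 2, 3\}, \mathrm{SNR})$
for all $\mathrm{SNR} > 0$, so that $\{1, 2\}$ is the minimizer.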
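The two roots of the quadratic can be confirmed by direct substitution. The short sketch below, with hypothetical function names and $p$ standing for $|\rho|$, evaluates the quadratic at both closed-form roots for a few values of $p$.

```python
def quadratic(p, s):
    """Left-hand side 1 - 4p^2 + (4 - 12p^2) s + (4 - 8p^2 + 4p^4) s^2."""
    return (1 - 4 * p**2) + (4 - 12 * p**2) * s + (4 - 8 * p**2 + 4 * p**4) * s**2

def snr1(p):
    """Negative root: -(2p + 1) / (2 (1 + p)^2)."""
    return -(2 * p + 1) / (2 * (1 + p) ** 2)

def snr2(p):
    """Root that is positive only for p > 1/2: (2p - 1) / (2 (1 - p)^2)."""
    return (2 * p - 1) / (2 * (1 - p) ** 2)

if __name__ == "__main__":
    for p in (0.1, 0.5, 0.7, 0.9):
        print(p, snr1(p), snr2(p), quadratic(p, snr1(p)), quadratic(p, snr2(p)))
```

Both residuals should vanish up to rounding, and the sign of the second root changes at $p = 1/2$, in line with the case split above.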